Abstract

A basic fact in spectral graph theory is that the number of connected components in an
undirected graph is equal to the multiplicity of the eigenvalue zero in the Laplacian matrix
of the graph. In particular, the graph is disconnected if and only if there are at least two
eigenvalues equal to zero. Cheeger's inequality and its variants provide an approximate version
of the latter fact; they state that a graph has a sparse cut if and only if there are at least two
eigenvalues that are close to zero.

It has been conjectured that an analogous characterization holds for higher multiplicities,
i.e., there are $k$ eigenvalues close to zero if and only if the vertex set can be partitioned into $k$
subsets, each defining a sparse cut. We resolve this conjecture positively. Our result provides a
theoretical justification for clustering algorithms that use the bottom $k$ eigenvectors to embed
the vertices into $\mathbb{R}^k$, and then apply geometric considerations to the embedding.

We also show that these techniques yield a nearly optimal quantitative connection between
the expansion of sets of size $\approx n/k$ and $\lambda_k$, the $k$th smallest eigenvalue of the normalized
Laplacian, where $n$ is the number of vertices. In particular, we show that in every graph there
are at least $k/2$ disjoint sets (one of which will have size at most $2n/k$), each having expansion
at most $O(\sqrt{\lambda_k \log k})$. Louis, Raghavendra, Tetali, and Vempala have independently proved a
slightly weaker version of this last result. The $\sqrt{\log k}$ bound is tight, up to constant factors, for
the "noisy hypercube" graphs.

1 Introduction 

Let $G = (V,E)$ be an undirected, $d$-regular graph. Its normalized Laplacian matrix $L \in \mathbb{R}^{V \times V}$ is
given by $L = I - \frac{1}{d}A$, where $A$ is the adjacency matrix of $G$. For the moment, we confine ourselves
to unweighted, regular graphs, while the results in the paper are presented for arbitrary weighted
graphs, with suitable changes to $L$. It is easy to see that $L$ is a positive semi-definite matrix, and
its eigenvalues satisfy $0 = \lambda_1 \le \lambda_2 \le \cdots \le \lambda_{|V|}$. Elementary arguments show that the number of
connected components of $G$ is precisely the multiplicity of the eigenvalue zero, that is, $\lambda_k = 0$ if
and only if the graph has at least $k$ connected components.
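As a small illustration of this fact, the following sketch counts the zero eigenvalues of $L = I - \frac{1}{d}A$ exactly, by computing the nullity of $L$ with rational Gaussian elimination. The helper names and the two example graphs are assumptions made for illustration; they do not come from the paper.

```python
from fractions import Fraction

def normalized_laplacian(adj, d):
    """Return L = I - A/d for a d-regular graph given by a 0/1 adjacency matrix."""
    n = len(adj)
    return [[Fraction(int(i == j)) - Fraction(adj[i][j], d) for j in range(n)]
            for i in range(n)]

def rank(matrix):
    """Rank over the rationals via Gauss-Jordan elimination (exact arithmetic)."""
    m = [row[:] for row in matrix]
    rk = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(len(m)):
            if r != rk and m[r][col] != 0:
                factor = m[r][col] / m[rk][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk

def zero_eigenvalue_multiplicity(adj, d):
    """L is symmetric, so the multiplicity of eigenvalue 0 equals its nullity."""
    return len(adj) - rank(normalized_laplacian(adj, d))

def adjacency(n, edges):
    adj = [[0] * n for _ in range(n)]
    for u, v in edges:
        adj[u][v] = adj[v][u] = 1
    return adj

# Hypothetical examples: two disjoint triangles, and a single 6-cycle (both 2-regular).
two_triangles = adjacency(6, [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)])
six_cycle = adjacency(6, [(i, (i + 1) % 6) for i in range(6)])
```

Calling `zero_eigenvalue_multiplicity(two_triangles, 2)` counts the two components of the first graph, while the connected 6-cycle has a single zero eigenvalue.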
Cheeger's inequality for graphs [AM85, Alo86, SJ89] yields a robust version of this fact for
$k = 2$. To state it, we introduce some notation. For any subset $S \subseteq V$, define the expansion of $S$
to be the quantity

$$\phi_G(S) = \frac{|E(S, \bar{S})|}{d\,|S|},$$

where $E(S, \bar{S})$ denotes the set of edges of $G$ crossing from $S$ to its complement. We may also define,
for every $k \in \mathbb{N}$, the $k$-way expansion constant,

$$\rho_G(k) = \min_{S_1, S_2, \ldots, S_k} \max\{\phi_G(S_i) : i = 1, 2, \ldots, k\},$$

where the minimum is over all collections of $k$ non-empty, disjoint subsets $S_1, S_2, \ldots, S_k \subseteq V$.
Observe that $\rho_G(k) = 0$ if and only if $\lambda_k = 0$. Cheeger's inequality offers the following quantitative
connection between $\rho_G(2)$ and $\lambda_2$,

$$\frac{\lambda_2}{2} \le \rho_G(2) \le \sqrt{2\lambda_2}. \qquad (1)$$

We remark that the left-hand side follows easily, and the non-trivial content of the connection is 
contained in the right-hand side inequality. 
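To make (1) concrete, here is a brute-force check on the 8-cycle. It assumes the standard form of Cheeger's inequality, $\lambda_2/2 \le \rho_G(2) \le \sqrt{2\lambda_2}$, and the known fact that the cycle $C_n$ has $\lambda_2 = 1 - \cos(2\pi/n)$. The function names are illustrative only, and the exhaustive search is exponential in $n$, so this is a sanity check rather than an algorithm.

```python
import math
from itertools import product

def cycle_edges(n):
    return [(v, (v + 1) % n) for v in range(n)]

def expansion(S, edges, d):
    """phi_G(S) = |E(S, complement)| / (d |S|) for a d-regular graph."""
    cut = sum(1 for u, v in edges if (u in S) != (v in S))
    return cut / (d * len(S))

def two_way_expansion(n, edges, d):
    """rho_G(2): minimize max(phi(S1), phi(S2)) over disjoint non-empty S1, S2."""
    best = math.inf
    for labels in product((0, 1, 2), repeat=n):  # 0 = unused, 1 = S1, 2 = S2
        S1 = {v for v in range(n) if labels[v] == 1}
        S2 = {v for v in range(n) if labels[v] == 2}
        if S1 and S2:
            best = min(best, max(expansion(S1, edges, d), expansion(S2, edges, d)))
    return best

n, d = 8, 2
rho2 = two_way_expansion(n, cycle_edges(n), d)
lambda2 = 1 - math.cos(2 * math.pi / n)  # second eigenvalue of I - A/2 on the cycle
```

On this example the optimum is attained by splitting the cycle into two arcs of four vertices each, and `rho2` falls between `lambda2 / 2` and `math.sqrt(2 * lambda2)`.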

The discrete version of Cheeger's inequality is proved via a simple spectral partitioning algorithm.
Besides being an important theoretical tool, since their inception spectral methods have
been used for solving a wide range of optimization problems, from graph coloring [AG83, AK97] to
image segmentation [SM00, TM06] to web search [Kle99, BP98].

Higher-order Cheeger inequalities. In general, we study higher-order analogs of (1), and
develop new multi-way spectral partitioning algorithms. A special case of one of our main theorems
(see Section 3.4 and Theorem 4.9) follows. It offers a strong quantitative version of the fact that
$\rho_G(k) = 0 \iff \lambda_k = 0$.



Theorem 1.1. For every graph $G$, and every $k \in \mathbb{N}$, we have

^' <PG{k)<0{e)y%. (2) 
This resolves a conjecture of Miclo [Mic08]; see also [DJM12], where some special cases are
considered. Independent of our work, Tanaka [Tan12] proved a weaker version of the theorem by
showing that

PG{k)<0{3'')^/X~k. 

His proof is constructive but, unlike our approach, it does not provide a polynomial-time algorithm
for constructing the $k$ sets. We remark that from Theorem 1.1, it is easy to find a partition of the
vertex set into $k$ non-empty pieces such that every piece in the partition has expansion $O(k^4)\sqrt{\lambda_k}$
(see Theorem 3.8). It is known that a dependence on $k$ in the right-hand side of (2) is necessary;
see Section 4.3.

Moreover, our proof is algorithmic and leads to new algorithms for $k$-way spectral partitioning.
This provides a theoretical justification for clustering algorithms that use the bottom $k$ eigenvectors
of the Laplacian^1 to embed the vertices into $\mathbb{R}^k$, and then apply geometric considerations to the
embedding. See [VM03] for a survey of such approaches. As a particular example, consider the work
of Jordan, Ng and Weiss [NJW02] which applies a $k$-means clustering algorithm to the embedding
in order to achieve a $k$-way partitioning. Our proof of Theorem 1.1 employs a similar algorithm,
where the $k$-means step is replaced by a random geometric partitioning. It remains an interesting
open problem whether $k$-means itself can be analyzed in this setting.

^1 Equivalently, algorithms that use the top $k$ eigenvectors of the adjacency matrix.

Finding many sets and small-set expansion. If one is interested in finding slightly fewer sets, 
our approach performs significantly better. 

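Theorem 1.2. For every graph $G$, and every $k \in \mathbb{N}$, we have

$$\rho_G(k) \le O\big(\sqrt{\lambda_{2k} \log k}\big). \qquad (3)$$

If $G$ is planar, then the bound improves to

$$\rho_G(k) \le O\big(\sqrt{\lambda_{2k}}\big). \qquad (4)$$

More generally, if $G$ excludes $K_h$ as a minor, then

$$\rho_G(k) \le O\big(h^2 \sqrt{\lambda_{2k}}\big).$$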

We remark that the bound (3) holds with $2k$ replaced by $(1 + \delta)k$ for any $\delta > 0$, but where
the leading constant now becomes 6~^; see Corollary 4.2. Louis, Raghavendra, Tetali and Vempala 
[LRTV12] have independently proved a somewhat weaker version of the bound (3), using rather
different techniques. Specifically, they show that there exists an absolute constant $C > 1$ such that
$\rho_G(k) \le O\big(\sqrt{\lambda_{Ck} \log k}\big)$.

In particular, Theorem 1.2 has applications to the small-set expansion problem in graphs,
which is fundamentally connected to the Unique Games Conjecture and many other problems in
approximation algorithms (see [RS10, RST10]). To capture the expansion of small sets in graphs,
we define the value,

$$\varphi_G(k) = \min_{|S| \le |V|/k} \phi_G(S).$$

Clearly $\varphi_G(k) \le \rho_G(k)$ for every $k \in \mathbb{N}$.

Arora, Barak and Steurer [ABS10] prove the bound,

$$\varphi_G(k^{1/100}) \le O\big(\sqrt{\lambda_k \log_k n}\big),$$

where $n = |V|$. Note that for $k = n^{\varepsilon}$ and $\varepsilon \in (0,1)$, one achieves an upper bound of $O(\sqrt{\lambda_k/\varepsilon})$,
and this small loss in the expansion constant is crucial for applications to approximating small-set
expansion. This was recently improved further [GT12, OW12] by showing that for every $\alpha > 0$,

$$\varphi_G(k^{1-\alpha}) \le O\big(\sqrt{(\lambda_k/\alpha) \log_k n}\big).$$

These bounds work fairly well for large values of $k$, but give less satisfactory results when $k$ is
smaller.

Louis, Raghavendra, Tetali and Vempala [LRTV11] proved that

$$\varphi_G(\sqrt{k}) \le O\big(\sqrt{\lambda_k \log k}\big),$$

and conjectured that $\sqrt{k}$ could be replaced by $k$. Theorem 1.2 immediately yields,

$$\varphi_G(k/2) \le O\big(\sqrt{\lambda_k \log k}\big), \qquad (5)$$

resolving their conjecture up to a factor of 2 (and actually, as discussed earlier, up to a factor of
$1 + \delta$ for every $\delta > 0$).

Moreover, (5) is quantitatively optimal for the noisy hypercube graphs (see Section 4.3), yielding
an optimal connection between the $k$th Laplacian eigenvalue and expansion of sets of size $n/k$.

It is interesting to note that in [KLPT11], it is shown that for $n$-vertex, bounded-degree planar
graphs, one has $\lambda_k = O(k/n)$. Thus the spectral algorithm guaranteeing (4) partitions such a
planar graph into $k$ disjoint pieces, each of expansion $O(\sqrt{k/n})$. This is tight, up to a constant
factor, as one can easily see for an $\sqrt{n} \times \sqrt{n}$ planar grid, in which case the set of size $\approx n/k$ with
minimal expansion is a $\sqrt{n/k} \times \sqrt{n/k}$ subgrid.

1.1 High-dimensional spectral partitioning 

We now present an overview of the proofs of our main theorems, as well as explain our general
approach to multi-way spectral partitioning. Let $G = (V, E)$ be an undirected, $d$-regular graph. To
begin, for any $f : V \to \ell_2$, we recall the Rayleigh quotient,

$$\mathcal{R}_G(f) = \frac{\sum_{\{u,v\} \in E} \|f(u) - f(v)\|^2}{d \sum_{v \in V} \|f(v)\|^2}.$$

The Dirichlet version of Cheeger's inequality (see Lemma 2.1) proves that for any $f : V \to \ell_2$, it
is possible to find a subset $S \subseteq \{v \in V : f(v) \ne 0\}$ such that $\phi_G(S) \le \sqrt{2\mathcal{R}_G(f)}$. Thus in
order to find $k$ disjoint, non-expanding subsets $S_1, S_2, \ldots, S_k \subseteq V$, it suffices to find $k$ disjointly
supported functions $\psi_1, \ldots, \psi_k : V \to \ell_2$ such that $\mathcal{R}_G(\psi_i)$ is small for each $i = 1, 2, \ldots, k$.

In fact, in the same paper that Miclo conjectured the validity of Theorem 1.1, he conjectured
that finding such a family $\{\psi_i\}$ should be possible [Mic08, DJM12]. We resolve this conjecture and
prove the following theorem in Section 3.4.

Theorem 1.3. For any graph $G = (V,E)$ and any $k \in \mathbb{N}$, there exist disjointly supported functions
$\psi_1, \psi_2, \ldots, \psi_k : V \to \mathbb{R}$ such that for each $i = 1, 2, \ldots, k$, we have

$$\mathcal{R}_G(\psi_i) \le O(k^6)\,\lambda_k.$$

To prove this, we start with an orthonormal system of eigenfunctions of the Laplacian,

$$f_1, f_2, \ldots, f_k : V \to \mathbb{R},$$

where $f_i$ has eigenvalue $\lambda_i$. We then construct the embedding $F : V \to \mathbb{R}^k$ given by

$$F(v) = (f_1(v), f_2(v), \ldots, f_k(v)). \qquad (6)$$

Observe that $\mathcal{R}_G(F) \le \lambda_k$.
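The observation $\mathcal{R}_G(F) \le \lambda_k$ can be checked numerically on a cycle, where the eigenfunctions are known in closed form. The sketch below assumes the $d$-regular Rayleigh quotient is normalized as $\sum_{\{u,v\} \in E} \|F(u)-F(v)\|^2 / (d \sum_v \|F(v)\|^2)$ and uses the standard cosine and sine eigenfunctions of $C_n$; all names are illustrative.

```python
import math

def cycle_edges(n):
    return [(v, (v + 1) % n) for v in range(n)]

def cycle_eigenpairs(n):
    """First three eigenpairs of L = I - A/2 on the n-cycle (n >= 3),
    with eigenfunctions orthonormal in the unweighted space l^2(V)."""
    lam = 1 - math.cos(2 * math.pi / n)
    f1 = [1 / math.sqrt(n)] * n
    f2 = [math.sqrt(2 / n) * math.cos(2 * math.pi * v / n) for v in range(n)]
    f3 = [math.sqrt(2 / n) * math.sin(2 * math.pi * v / n) for v in range(n)]
    return [0.0, lam, lam], [f1, f2, f3]

def rayleigh_quotient(F, edges, d):
    """R_G(F) = sum_{uv in E} ||F(u)-F(v)||^2 / (d * sum_v ||F(v)||^2)."""
    num = sum(sum((a - b) ** 2 for a, b in zip(F[u], F[v])) for u, v in edges)
    den = d * sum(sum(a * a for a in F[v]) for v in range(len(F)))
    return num / den

n, d = 12, 2
eigenvalues, fs = cycle_eigenpairs(n)
F = [tuple(f[v] for f in fs) for v in range(n)]  # F(v) = (f1(v), f2(v), f3(v))
R_F = rayleigh_quotient(F, cycle_edges(n), d)
```

For an orthonormal system the quotient of the embedding is the average of the first $k$ eigenvalues, which is why it never exceeds $\lambda_k$.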

Thus our goal is now to "localize" $F$ on $k$ disjoint regions to produce disjointly supported
functions $\psi_1, \psi_2, \ldots, \psi_k : V \to \mathbb{R}^k$, each with small Rayleigh quotient. (It is elementary to see that
for any map $\psi : V \to \mathbb{R}^k$, there exists some coordinate $j \in \{1, 2, \ldots, k\}$ such that the $\mathbb{R}$-valued
map $\tilde\psi(v) = \psi(v)_j$ has $\mathcal{R}_G(\tilde\psi) \le \mathcal{R}_G(\psi)$.) In order to ensure that $\mathcal{R}_G(\psi_i)$ is small for each $i$, we
must ensure that each region captures a large fraction of the $\ell^2$ mass of $F$ and that our localization
process is sufficiently smooth.

Figure 1: Partitioning according to the radial distance.

Isotropy and spreading. The first problem we face is that, in order to find $k$ disjoint regions
each with large $\ell^2$ mass, it should be that the $\ell^2$ mass of $F$ is sufficiently well-spread. This follows
from the following isotropy property of $F$ (see Lemma 3.2): For any vector $x \in S^{k-1}$ (the unit
sphere of $\mathbb{R}^k$),

$$\sum_{v \in V} \langle x, F(v) \rangle^2 = 1. \qquad (7)$$

On the other hand, it is straightforward to check that

$$\sum_{v \in V} \|F(v)\|^2 = k,$$

thus it is impossible for the $\ell^2$ mass of $F$ to "concentrate" along fewer than $k$ directions $x_1, x_2, \ldots, x_k \in S^{k-1}$.
A natural approach would be to find (at least) $k$ such directions, and then define,

$$\psi_i(v) = \begin{cases} F(v) & \text{if } F(v) \text{ has large projection on } x_i, \\ 0 & \text{otherwise.} \end{cases}$$

Unfortunately, this sharp cutoff could make the value

$$\sum_{\{u,v\} \in E} \|\psi_i(u) - \psi_i(v)\|^2$$

much larger than the corresponding quantity for $F$. Thus we must pursue a smoother approach for
localizing $F$.

The radial projection distance. Our method of smooth localization depends crucially on defining
a proper notion of distance between vertices, based on the map $F$. We would like to think of
two vertices $u, v \in V$ as close if their Euclidean distance $\|F(u) - F(v)\|$ is small compared to their
norms $\|F(u)\|, \|F(v)\|$. To capture this, we define the radial projection distance via,

$$d_F(u, v) = \left\| \frac{F(u)}{\|F(u)\|} - \frac{F(v)}{\|F(v)\|} \right\|.$$

Note that a ball in $d_F$ corresponds to a cone in $\mathbb{R}^k$; see Figure 1.
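A direct transcription of this definition, for non-zero vectors only, might look as follows. The function name is an assumption; the two sample evaluations illustrate why balls in this distance are cones: the distance ignores the norms and sees only the directions.

```python
import math

def radial_projection_distance(x, y):
    """d_F(u, v) for x = F(u), y = F(v), both assumed non-zero:
    the Euclidean distance between x/||x|| and y/||y||."""
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    return math.sqrt(sum((a / nx - b / ny) ** 2 for a, b in zip(x, y)))

# Points on the same ray are at distance 0, whatever their norms,
# which is why a ball in this distance pulls back to a cone.
same_ray = radial_projection_distance((1.0, 0.0), (5.0, 0.0))
orthogonal = radial_projection_distance((1.0, 0.0), (0.0, 2.0))
```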

Our goal now becomes to find separated regions $S_1, \ldots, S_k \subseteq V$ in $d_F$, each of which contains
a large fraction of the $\ell^2$ mass of $F$. If these regions are far enough apart, then there is a way to
allow $\psi_i$ to degrade gracefully off of $S_i$, ensuring that $\mathcal{R}_G(\psi_i)$ remains small; see Lemma 3.3.

The isotropy condition (7) gives us the following energy spreading property of $d_F$: If $S \subseteq V$,
then

diam(5,d^)<i =^ Ell^Wll'^^Ell^Wll'- (8) 

ves vev 

In other words, sets of small $d_F$-diameter cannot contain a large fraction of the $\ell^2$ mass. This will
be essential in finding regions $\{S_i\}$.

Finding separated regions: Random space partitions. In order to find many separated
regions, we rely on the theory of random partitions discussed in Section 2.3. Roughly speaking,
this partitions $\mathbb{R}^k$ (and thus our set of points) randomly into pieces of diameter at most $1/2$ so
that the expected fraction of $\ell^2$ mass which is close to the boundary of the partition is small. Thus
we can take unions of the interiors of the pieces to find separated sets. Furthermore, no set in
the partition can contain a large fraction of the $\ell^2$ mass, due to the spreading property of $d_F$ (8).
This is carried out in Section 3.3. We use these separated sets as the supports of our family $\{\psi_i\}$,
allowing us to complete the proof of Theorem 1.3.

The notion of "close to the boundary" depends on the dimension $k$, and thus the smoothness
of our maps $\{\psi_i\}$ will degrade as the dimension grows. For many families of graphs, however, we
can appeal to special properties of their intrinsic geometry.

Exploiting the intrinsic geometry. It is well-known that the shortest-path metric on a planar
graph has many nice properties, but $d_F$ is, in general, not a shortest-path geometry. Thus it is
initially unclear how one might prove a bound like (4) using our approach. The answer is to combine
information from the spectral embedding with the intrinsic geometry of the graph.

We define $\hat{d}_F$ as the shortest-path pseudometric on $G$, where the length of an edge $\{u, v\} \in E$
is precisely $d_F(u,v)$. In Sections 3.2 and 3.3, we show that it is possible to do the partitioning
in the metric $\hat{d}_F$, and thus for planar graphs (and other generalizations), we are able to achieve
dimension-independent bounds in Theorem 1.2.

This technique also addresses a common shortcoming of spectral methods: The spectral embedding
can lose auxiliary information about the input data that could help with clustering. Our
"hybrid" technique for planar graphs suggests that such information (in this case, planarity) can
be fruitfully combined with the spectral computations.

Dimension reduction. In order to obtain the tight bound (3) for general graphs, we have 
to improve the quantitative parameters of our construction significantly. The main loss in our 
preceding construction comes from the ambient dimension k. 

Thus our first step is to apply dimension-reduction techniques: We randomly project our points
from $\mathbb{R}^k$ into $\mathbb{R}^{O(\log k)}$. Let $F' : V \to \mathbb{R}^{O(\log k)}$ be the resulting map. While it is easy to see that
$\mathcal{R}_G(F') \asymp \mathcal{R}_G(F)$ with high probability, it is not, a priori, clear why $O(\log k)$ dimensions suffice for
maintaining the energy spreading properties of $F$. Indeed, the isotropy condition (7) will generally
fail for $F'$. Although the proof is delicate (see Lemma 4.3), the basic idea is this: If $d_F$ satisfies (8),
but $d_{F'}$ fails to satisfy a related property, then a $\gtrsim \frac{1}{k}$ fraction of the $\ell^2$ mass has to have moved
significantly in the dimension reduction step, and such an event is unlikely for a random mapping
into $O(\log k)$ dimensions.

A new multi-way Cheeger inequality. Dimension reduction only yields a loss of $O(\log k)$
in (3). In order to get the bound down to $\sqrt{\log k}$, we have to abandon our goal of localizing
eigenfunctions. In Section 4.2, we give a new multi-way Cheeger rounding algorithm that combines
random partitions of the radial projection distance $d_F$, and random thresholding based on $\|F(\cdot)\|$
(as in Cheeger's inequality). By analyzing these two processes simultaneously, we are able to achieve
the optimal loss.

1.2 A general algorithm 

Given a graph $G = (V, E)$ and any embedding $F : V \to \mathbb{R}^k$ (in particular, the spectral embedding
(6)), our approach yields a general algorithmic paradigm for finding many non-expanding sets. For
some $r \in \mathbb{N}$, do the following:

i) (Radial decomposition)

Find disjoint subsets $S_1, S_2, \ldots, S_r \subseteq V$ using the values $\{F(v)/\|F(v)\| : v \in V\}$.

ii) (Cheeger sweep)

For each $i = 1, 2, \ldots, r$:

Sort the vertices $S_i = \{v_1, v_2, \ldots, v_{n_i}\}$ so that

$$\|F(v_1)\| \le \|F(v_2)\| \le \cdots \le \|F(v_{n_i})\|.$$

Output the least-expanding set among the $n_i - 1$ sets of the form,

for $1 \le j \le n_i - 1$.

As discussed in the preceding section, each of our main theorems is proved using an instantiation
of this schema. For instance, the proof of Theorem 1.1 partitions using the radial projection distance
$d_F$. The proof of (4) uses the induced shortest-path metric $\hat{d}_F$. And the proof of (3) uses $d_{F'}$, where
$F' : V \to \mathbb{R}^{O(\log k)}$ is obtained from random projection. The details of the scheme for equation (3)
are provided in Section 5. A practical algorithm might use $r$-means to cluster according to the radial
projection distance.
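The two-step schema above can be sketched in a few lines. This is only an illustration under stated assumptions: the radial decomposition is a simple greedy grouping (a stand-in for the random partitions analyzed in the paper, or for $r$-means), the sweep is assumed to range over the sets of largest-norm vertices, expansion is measured as cut over volume (which equals $|E(S,\bar{S})|/(d|S|)$ on $d$-regular graphs), and the graph and embedding are hand-made rather than spectral.

```python
import math

def norm(x):
    return math.sqrt(sum(a * a for a in x))

def radial_distance(x, y):
    nx, ny = norm(x), norm(y)
    return math.sqrt(sum((a / nx - b / ny) ** 2 for a, b in zip(x, y)))

def expansion(S, edges, degree):
    """phi(S) = cut(S) / vol(S); equals |E(S, S-bar)| / (d|S|) on d-regular graphs."""
    cut = sum(1 for u, v in edges if (u in S) != (v in S))
    return cut / sum(degree[v] for v in S)

def radial_decomposition(F, radius):
    """Step (i), greedy stand-in: group vertices whose normalized vectors lie
    within `radius` of a cluster's first member. Zero vectors are skipped."""
    clusters = []  # list of (representative vector, member list)
    for v, x in F.items():
        if norm(x) == 0:
            continue
        for rep, members in clusters:
            if radial_distance(rep, x) < radius:
                members.append(v)
                break
        else:
            clusters.append((x, [v]))
    return [members for _, members in clusters]

def cheeger_sweep(members, F, edges, degree):
    """Step (ii), assumed form: try the sets of the largest-norm vertices
    (threshold sets of ||F||) and return the least-expanding one."""
    order = sorted(members, key=lambda v: norm(F[v]), reverse=True)
    prefixes = [set(order[:j]) for j in range(1, len(order) + 1)]
    return min(prefixes, key=lambda S: expansion(S, edges, degree))

# Hypothetical input: two triangles joined by the edge {2, 3}.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
degree = {v: sum(v in e for e in edges) for v in range(6)}
F = {0: (1.0, 0.0), 1: (0.9, 0.0), 2: (0.8, 0.0),
     3: (0.0, 0.8), 4: (0.0, 0.9), 5: (0.0, 1.0)}

parts = radial_decomposition(F, radius=0.5)
sets = [cheeger_sweep(p, F, edges, degree) for p in parts]
```

On this toy input the decomposition separates the two triangles by direction, and each sweep returns a whole triangle, whose only outgoing edge is the bridge.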

We remark that partitioning the normalized vectors as in step (i) is used in the approach of 
[NJW02], but not in some other methods of spectral partitioning (see [VM03] for alternatives). 
Our analysis suggests a theoretical justification for partitioning using the normalized vectors. 
2 Preliminaries

Let $G = (V, E, w)$ be a finite, undirected graph, with positive weights $w : E \to (0, \infty)$ on the edges.
For a pair of vertices $u, v \in V$, we sometimes write $w(u,v)$ for $w(\{u,v\})$. For a subset of vertices
$S \subseteq V$, we write $E(S, \bar{S}) := \{\{u, v\} \in E : |\{u, v\} \cap S| = 1\}$. For a subset of edges $F \subseteq E$, we write
$w(F) = \sum_{e \in F} w(e)$. We use $u \sim v$ to denote $\{u,v\} \in E$. We extend the weight to vertices by
defining, for a single vertex $v \in V$, $w(v) := \sum_{u \sim v} w(u,v)$. We can think of $w(v)$ as the weighted
degree of vertex $v$. We will assume throughout that $w(v) > 0$ for every $v \in V$. For $S \subseteq V$, we write
$w(S) = \sum_{v \in S} w(v)$.

Let $X$ be a set and let $d : X \times X \to [0, \infty]$ be a symmetric non-negative function which may take
the value $\infty$. We refer to $d$ as an extended pseudo-metric on $X$ if it satisfies the triangle inequality.
For a subset $S \subseteq X$, we write $\mathrm{diam}(S,d) := \sup_{x,y \in S} d(x,y)$, and for two sets $S, T \subseteq X$, we write
$d(S,T) := \inf_{x \in S, y \in T} d(x,y)$. We also define the ball $B_d(x,R) := \{y \in X : d(x,y) \le R\}$.

For two expressions $A$ and $B$, we write $A \lesssim B$ for $A \le O(B)$ and $A \asymp B$ for the conjunction of
$A \lesssim B$ and $A \gtrsim B$.

2.1 Spectral theory of the weighted Laplacian 

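We write $\ell^2(V, w)$ for the Hilbert space of functions $f : V \to \mathbb{R}$ with inner product

$$\langle f, g \rangle_{\ell^2(V,w)} := \sum_{v \in V} w(v) f(v) g(v),$$

and norm $\|f\|^2_{\ell^2(V,w)} = \langle f, f \rangle_{\ell^2(V,w)}$. We reserve $\langle \cdot, \cdot \rangle$ and $\|\cdot\|$ for the standard inner product and
norm on $\mathbb{R}^k$, $k \in \mathbb{N}$ and $\ell^2(V)$.

We now discuss some operators on $\ell^2(V, w)$. The adjacency operator is defined by
$Af(v) = \sum_{u \sim v} w(u, v) f(u)$, and the diagonal degree operator by $Df(v) = w(v) f(v)$. Then the combinatorial
Laplacian is defined by $L = D - A$, and the normalized Laplacian is given by

$$\mathcal{L}_G := I - D^{-1/2} A D^{-1/2}.$$

Observe that for an unweighted, $d$-regular graph, we have $\mathcal{L}_G = \frac{1}{d} L$.
Now, if $g : V \to \mathbb{R}$ is a non-zero function and $f = D^{-1/2} g$, then

$$\frac{\langle g, \mathcal{L}_G g \rangle}{\langle g, g \rangle} = \frac{\langle g, D^{-1/2} L D^{-1/2} g \rangle}{\langle g, g \rangle} = \frac{\langle f, L f \rangle}{\langle D^{1/2} f, D^{1/2} f \rangle} = \frac{\sum_{u \sim v} w(u,v) |f(u) - f(v)|^2}{\sum_{v \in V} w(v) f(v)^2} =: \mathcal{R}_G(f),$$

where the latter value is referred to as the Rayleigh quotient of $f$ (with respect to $G$).
In particular, one sees that $\mathcal{L}_G$ is a positive semi-definite operator with eigenvalues

$$0 = \lambda_1 \le \lambda_2 \le \cdots \le \lambda_n \le 2.$$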
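The identity between the quadratic form of the normalized Laplacian at $g = D^{1/2} f$ and the Rayleigh quotient of $f$ can be checked on a small weighted graph. The graph, the weights and the test function below are arbitrary choices made for illustration.

```python
import math

# Hypothetical weighted triangle with a pendant vertex; weights are arbitrary.
weights = {(0, 1): 2.0, (1, 2): 1.0, (0, 2): 3.0, (2, 3): 0.5}
n = 4

def weighted_degree(v):
    return sum(w for (a, b), w in weights.items() if v in (a, b))

def rayleigh_quotient(f):
    """R_G(f) = sum_{u~v} w(u,v) (f(u)-f(v))^2 / sum_v w(v) f(v)^2."""
    num = sum(w * (f[a] - f[b]) ** 2 for (a, b), w in weights.items())
    den = sum(weighted_degree(v) * f[v] ** 2 for v in range(n))
    return num / den

def normalized_laplacian_quotient(g):
    """<g, L_G g> / <g, g> with L_G = I - D^{-1/2} A D^{-1/2}."""
    deg = [weighted_degree(v) for v in range(n)]
    Lg = list(g)  # start from the identity part, then subtract D^{-1/2} A D^{-1/2} g
    for (a, b), w in weights.items():
        Lg[a] -= w * g[b] / math.sqrt(deg[a] * deg[b])
        Lg[b] -= w * g[a] / math.sqrt(deg[a] * deg[b])
    return sum(x * y for x, y in zip(g, Lg)) / sum(x * x for x in g)

f = [1.0, -2.0, 0.5, 3.0]
g = [math.sqrt(weighted_degree(v)) * f[v] for v in range(n)]  # g = D^{1/2} f
```

Both functions return the same value for this pair, and the value lies in $[0, 2]$, consistent with the eigenvalue range stated above.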
For a connected graph, the first eigenvalue corresponds to the eigenfunctions $g = D^{1/2} f$, where $f$
is any non-zero constant function. Furthermore, by standard variational principles,

$$\lambda_k = \min_{g_1, \ldots, g_k} \max\left\{ \frac{\langle g, \mathcal{L}_G g \rangle}{\langle g, g \rangle} : 0 \ne g \in \mathrm{span}\{g_1, \ldots, g_k\} \right\}
= \min_{f_1, \ldots, f_k} \max\left\{ \mathcal{R}_G(f) : 0 \ne f \in \mathrm{span}\{f_1, \ldots, f_k\} \right\}, \qquad (9)$$

where both minimums are over sets of $k$ non-zero orthogonal functions in the Hilbert spaces $\ell^2(V)$
and $\ell^2(V, w)$, respectively. We refer to [Chu97] for more background on the spectral theory of the
normalized Laplacian.

2.2 Cheeger's inequality with Dirichlet boundary conditions 

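Given a subset $S \subseteq V$, we denote the Dirichlet conductance of $S$ by

$$\phi_G(S) := \frac{w(E(S, \bar{S}))}{w(S)}.$$

If $\mathcal{H}$ is a Hilbert space, we extend the notion of Rayleigh quotients to arbitrary maps $\psi : V \to \mathcal{H}$
via,

$$\mathcal{R}_G(\psi) := \frac{\sum_{u \sim v} w(u,v) \|\psi(u) - \psi(v)\|_{\mathcal{H}}^2}{\sum_{v \in V} w(v) \|\psi(v)\|_{\mathcal{H}}^2}. \qquad (10)$$

In what follows, we use $\mathrm{supp}(\psi) := \{v \in V : \psi(v) \ne 0\}$.

Many variants of the following lemma are known; see, e.g. [Chu96].

Lemma 2.1. For any $\psi : V \to \mathcal{H}$, there exists a subset $S \subseteq \mathrm{supp}(\psi)$ with

$$\phi_G(S) \le \sqrt{2 \mathcal{R}_G(\psi)}.$$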

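Proof. Let $\|\cdot\| = \|\cdot\|_{\mathcal{H}}$. We may assume that $\mathrm{supp}(\psi) \ne V$, else taking $S = V$ finishes the
argument. Since $\mathcal{R}_G(\psi)$ is homogeneous in $\psi$, we may assume that $\|\psi(v)\| \le 1$ for every $v \in V$. Define a
subset $S_t = \{v \in V : \|\psi(v)\|^2 \ge t\}$, and let $t \in (0, 1]$ be chosen uniformly at random. Observe that
$S_t \subseteq \mathrm{supp}(\psi)$ by construction.
Then we have the estimate,

$$\mathbb{E}[w(S_t)] = \sum_{v \in V} w(v) \|\psi(v)\|^2,$$

as well as,

$$\begin{aligned}
\mathbb{E}[w(E(S_t, \bar{S}_t))] &= \sum_{u \sim v} w(u,v) \left| \|\psi(u)\|^2 - \|\psi(v)\|^2 \right| \\
&\le \sum_{u \sim v} w(u,v) \, \|\psi(u) - \psi(v)\| \left( \|\psi(u)\| + \|\psi(v)\| \right) \\
&\le \sqrt{\sum_{u \sim v} w(u,v) \|\psi(u) - \psi(v)\|^2} \; \sqrt{\sum_{u \sim v} w(u,v) \left( \|\psi(u)\| + \|\psi(v)\| \right)^2} \\
&\le \sqrt{\sum_{u \sim v} w(u,v) \|\psi(u) - \psi(v)\|^2} \; \sqrt{2 \sum_{v \in V} w(v) \|\psi(v)\|^2}.
\end{aligned}$$

Combining these two inequalities yields,

$$\frac{\mathbb{E}[w(E(S_t, \bar{S}_t))]}{\mathbb{E}[w(S_t)]} \le \sqrt{2 \mathcal{R}_G(\psi)},$$

implying there exists a $t \in (0, 1]$ for which $S_t$ satisfies the statement of the lemma.

□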
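The proof is effectively an algorithm: scan the threshold sets $S_t$ and keep the best one. The sketch below does this deterministically for a scalar map on a path, assuming unit edge weights; the graph and the values of `psi` are hypothetical.

```python
import math

# Hypothetical example: unweighted path 0-1-2-3-4-5 and a scalar map psi.
edges = [(i, i + 1) for i in range(5)]
degree = {v: sum(v in e for e in edges) for v in range(6)}
psi = [0.0, 0.0, 1.0, 2.0, 3.0, 3.0]

def conductance(S):
    """phi_G(S) = w(E(S, S-bar)) / w(S), with unit edge weights."""
    cut = sum(1 for u, v in edges if (u in S) != (v in S))
    return cut / sum(degree[v] for v in S)

def rayleigh_quotient(psi):
    num = sum((psi[u] - psi[v]) ** 2 for u, v in edges)
    den = sum(degree[v] * psi[v] ** 2 for v in degree)
    return num / den

def best_threshold_set(psi):
    """Scan the sets S_t = {v : psi(v)^2 >= t} over all thresholds t > 0
    at which S_t changes, and keep the one of least conductance."""
    thresholds = sorted({x * x for x in psi if x != 0})
    candidates = [{v for v in degree if psi[v] ** 2 >= t} for t in thresholds]
    return min(candidates, key=conductance)

S = best_threshold_set(psi)
bound = math.sqrt(2 * rayleigh_quotient(psi))
```

Here the best threshold set is the whole support of `psi`, and its conductance is below $\sqrt{2\mathcal{R}_G(\psi)}$, as the lemma guarantees for at least one threshold.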



2.3 Random partitions of metric spaces 

We now discuss some of the theory of random partitions of metric spaces. Let $(X, d)$ be a finite
metric space. We use $B(x,R) = \{y \in X : d(x,y) \le R\}$ to denote the closed ball of radius $R$ about
$x$. We will write a partition $P$ of $X$ as a function $P : X \to 2^X$ mapping a point $x \in X$ to the
unique set in $P$ that contains $x$.

For $\Delta > 0$, we say that $P$ is $\Delta$-bounded if $\mathrm{diam}(S) \le \Delta$ for every $S \in P$. We will also consider
distributions over random partitions. If $\mathcal{P}$ is a random partition of $X$, we say that $\mathcal{P}$ is $\Delta$-bounded
if this property holds with probability one.

A random partition $\mathcal{P}$ is $(\Delta, \alpha, \delta)$-padded if $\mathcal{P}$ is $\Delta$-bounded, and for every $x \in X$, we have

$$\mathbb{P}[B(x, \Delta/\alpha) \subseteq \mathcal{P}(x)] \ge \delta.$$

A random partition is $(\Delta, L)$-Lipschitz if $\mathcal{P}$ is $\Delta$-bounded, and, for every pair $x, y \in X$, we have

$$\mathbb{P}[\mathcal{P}(x) \ne \mathcal{P}(y)] \le L \cdot \frac{d(x,y)}{\Delta}.$$

Here are some results that we will need. The first theorem is known, more generally, for doubling
spaces [GKL03], but here we only need its application to $\mathbb{R}^k$. See also [LN05, Lem 3.11].

Theorem 2.2. If $X \subseteq \mathbb{R}^k$, then for every $\Delta > 0$ and $\delta > 0$, $X$ admits a $(\Delta, O(k/\delta), 1 - \delta)$-padded
random partition.

The next result is proved in [CCG+98]. See also [LN05, Lem 3.16].

Theorem 2.3. If $X \subseteq \mathbb{R}^k$, then for every $\Delta > 0$, $X$ admits a $(\Delta, O(\sqrt{k}))$-Lipschitz random
partition.

A partitioning theorem for excluded-minor graphs is presented in [KPR93], with an improved
quantitative dependence coming from [FT03].

Theorem 2.4. If $X$ is the shortest-path metric on a graph excluding $K_h$ as a minor, then for
every $\Delta > 0$ and $\delta > 0$, $X$ admits a $(\Delta, O(h^2/\delta), 1 - \delta)$-padded random partition and a $(\Delta, O(h^2))$-Lipschitz
random partition.

Finally, for the special case of bounded-genus graphs, a better bound is known [LS10].

Theorem 2.5. If $X$ is the shortest-path metric on a graph of genus $g$, for every $\Delta > 0$ and $\delta > 0$,
$X$ admits a $(\Delta, O((\log g)/\delta), 1 - \delta)$-padded random partition, and a $(\Delta, O(\log g))$-Lipschitz random
partition.
3 Localizing eigenfunctions

Let $G = (V, E, w)$ be a weighted graph. In the present section, we show how to find, for every
$k \in \mathbb{N}$, disjointly supported functions $\psi_1, \psi_2, \ldots, \psi_k : V \to \mathbb{R}$ with $\mathcal{R}_G(\psi_i) \le k^{O(1)} \lambda_k$, where $\lambda_k$ is
the $k$th smallest eigenvalue of $\mathcal{L}_G$.

3.1 The radial projection distance 

For $h \in \mathbb{N}$, consider a mapping $F : V \to \mathbb{R}^h$. A central role will be played by the radial projection
distance, which is an extended pseudo-metric on $V$: If $\|F(u)\|, \|F(v)\| > 0$, then

$$d_F(u, v) := \left\| \frac{F(u)}{\|F(u)\|} - \frac{F(v)}{\|F(v)\|} \right\|.$$

Otherwise, if $F(u) = F(v) = 0$, we put $d_F(u,v) := 0$, else $d_F(u,v) := \infty$.

In order to find many disjointly supported functions from a geometric representation
$F : V \to \mathbb{R}^h$, it should be that the $\ell^2$ mass of $F$ is not too concentrated. To this end, we say that $F$ is
$(\Delta, \eta)$-spreading (with respect to $G$) if, for all subsets $S \subseteq V$, we have

$$\mathrm{diam}(S, d_F) \le \Delta \implies \sum_{u \in S} w(u) \|F(u)\|^2 \le \eta \sum_{u \in V} w(u) \|F(u)\|^2.$$

First, we record the following simple fact.
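Lemma 3.1. For any $F : V \to \mathbb{R}^h$, and for all $u, v \in V$, we have $d_F(u,v) \|F(u)\| \le 2 \|F(u) - F(v)\|$.
Proof. For any non-zero vectors $x, y \in \mathbb{R}^h$, we have

$$\|x\| \left\| \frac{x}{\|x\|} - \frac{y}{\|y\|} \right\| = \left\| x - \frac{\|x\|}{\|y\|} y \right\| \le \|x - y\| + \left\| y - \frac{\|x\|}{\|y\|} y \right\| \le 2 \|x - y\|.$$

□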



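We now show that systems of $\ell^2(V,w)$-orthonormal functions give rise to spreading maps.

Lemma 3.2. Suppose that $f_1, f_2, \ldots, f_k : V \to \mathbb{R}$ is an $\ell^2(V,w)$-orthonormal system and that
$F : V \to \mathbb{R}^k$ is given by $F(v) = (f_1(v), f_2(v), \ldots, f_k(v))$. Then, for every $\Delta \in (0, 1)$, $F$ is
$\left(\Delta, \frac{1}{k(1 - \Delta^2)}\right)$-spreading with respect to $G$.

Proof. Let $x \in \mathbb{R}^k$ be any unit vector, and let $U : \mathbb{R}^k \to \ell^2(V)$ be defined by

$$(Ux)(v) := \sum_{i=1}^k x_i \sqrt{w(v)} f_i(v).$$

Observe that $(U^T U)_{ij} = \langle f_i, f_j \rangle_{\ell^2(V,w)}$, hence $U^T U = I$. Thus,

$$\sum_{v \in V} w(v) \langle x, F(v) \rangle^2 = \langle Ux, Ux \rangle = \langle x, U^T U x \rangle = 1. \qquad (11)$$

Now, let $S \subseteq V$ satisfy $\mathrm{diam}(S, d_F) \le \Delta$. Fix any $u \in S$ and use (11) to write,

$$1 = \sum_{v \in V} w(v) \left\langle F(v), \frac{F(u)}{\|F(u)\|} \right\rangle^2 = \sum_{v \in V} w(v) \|F(v)\|^2 \left(1 - \frac{d_F(u,v)^2}{2}\right)^2 \ge (1 - \Delta^2) \sum_{v \in S} w(v) \|F(v)\|^2.$$

The lemma now follows by noting that,

$$\sum_{v \in V} w(v) \|F(v)\|^2 = \sum_{v \in V} \sum_{i=1}^k w(v) f_i(v)^2 = k.$$

□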
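The isotropy identity (11) and the total mass $k$ can be checked numerically on a cycle, where an $\ell^2(V,w)$-orthonormal system of eigenfunctions is available in closed form. The scaling constants below are chosen so that orthonormality holds with $w(v) = 2$ for every vertex; the sample unit vectors are arbitrary.

```python
import math

n, k = 12, 3
w = [2.0] * n  # weighted degree of every vertex of the unweighted n-cycle

# Eigenfunctions of the cycle, scaled to be orthonormal in l^2(V, w).
fs = [
    [1 / math.sqrt(2 * n)] * n,
    [math.cos(2 * math.pi * v / n) / math.sqrt(n) for v in range(n)],
    [math.sin(2 * math.pi * v / n) / math.sqrt(n) for v in range(n)],
]
F = [tuple(f[v] for f in fs) for v in range(n)]

def directional_mass(x):
    """sum_v w(v) <x, F(v)>^2, which (11) says is 1 for every unit vector x."""
    return sum(w[v] * sum(a * b for a, b in zip(x, F[v])) ** 2 for v in range(n))

total_mass = sum(w[v] * sum(a * a for a in F[v]) for v in range(n))  # should be k
unit_vectors = [(1.0, 0.0, 0.0), (0.6, 0.0, 0.8), (1 / math.sqrt(3),) * 3]
```

Every direction carries exactly one unit of mass while the total is $k$, which is the quantitative sense in which the mass cannot concentrate along few directions.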



3.2 Smooth localization 

Given a map $F : V \to \mathbb{R}^h$ and a subset $S \subseteq V$, we now show how to construct a function supported
on a small neighborhood of $S$, which retains the $\ell^2$ mass of $F$ on $S$, and which doesn't stretch edges
by too much.

For future applications, it will be useful to consider the largest metric on $G$ which agrees with
$d_F$ on edges. This is the induced shortest-path (extended pseudo-) metric on $G$, where the length
of an edge $\{u,v\} \in E$ is given by $d_F(u,v)$. We will use the notation $\hat{d}_F$ for this metric. Observe
that $\hat{d}_F \ge d_F$ since $d_F$ is a pseudo-metric. We will write

$$N_\varepsilon(S, \hat{d}_F) := \{v \in V : \hat{d}_F(v, S) < \varepsilon\}$$

for the open $\varepsilon$-neighborhood of $S$ in the metric $\hat{d}_F$.



Lemma 3.3 (Localization). For any $F : V \to \mathbb{R}^h$, the following holds. For every subset $S \subseteq V$ and
number $\varepsilon > 0$, there exists a mapping $\psi : V \to \mathbb{R}^h$ which satisfies the following three properties:

i) $\psi|_S = F|_S$,

ii) $\mathrm{supp}(\psi) \subseteq N_\varepsilon(S, \hat{d}_F)$, and

iii) if $\{u, v\} \in E$, then $\|\psi(u) - \psi(v)\| \le \left(1 + \frac{2}{\varepsilon}\right) \|F(u) - F(v)\|$.

Proof. First, define

$$\theta(v) := \max\left(0, 1 - \frac{\hat{d}_F(v, S)}{\varepsilon}\right).$$

In particular, observe that $\theta$ is $(1/\varepsilon)$-Lipschitz with respect to $\hat{d}_F$, so since $\hat{d}_F$ and $d_F$ agree on
edges, we have for every $\{u,v\} \in E$,

$$|\theta(u) - \theta(v)| \le \frac{1}{\varepsilon} d_F(u,v). \qquad (12)$$

Finally, set $\psi(v) := \theta(v) F(v)$.

Properties (i) and (ii) are immediate from the definition, thus we turn to property (iii). Fix
$\{u, v\} \in E$. We have,

$$\|\psi(u) - \psi(v)\| = \|\theta(u) F(u) - \theta(v) F(v)\| \le |\theta(v)| \cdot \|F(u) - F(v)\| + \|F(u)\| \cdot |\theta(u) - \theta(v)|.$$

Since $\theta \le 1$, the first term is at most $\|F(u) - F(v)\|$. Now, using (12), and Lemma 3.1, we have

$$\|F(u)\| \cdot |\theta(u) - \theta(v)| \le \frac{1}{\varepsilon} \cdot \|F(u)\| \cdot d_F(u,v) \le \frac{2}{\varepsilon} \cdot \|F(u) - F(v)\|,$$

completing the proof of (iii). □
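The construction in this proof can be sketched as follows: compute shortest-path distances to $S$ with edge lengths given by the radial projection distance, then multiply $F$ by a cutoff that decays linearly over distance $\varepsilon$. The sketch assumes the stretch constant $1 + 2/\varepsilon$ obtained by combining the two estimates in the proof; the path graph, the embedding and all function names are hypothetical.

```python
import heapq
import math

def norm(x):
    return math.sqrt(sum(a * a for a in x))

def radial_distance(x, y):
    """d_F with the convention of Section 3.1 for zero vectors."""
    nx, ny = norm(x), norm(y)
    if nx == 0 and ny == 0:
        return 0.0
    if nx == 0 or ny == 0:
        return math.inf
    return math.sqrt(sum((a / nx - b / ny) ** 2 for a, b in zip(x, y)))

def distance_to_set(F, edges, S):
    """Multi-source Dijkstra: shortest-path distance to S, edge {u,v} having length d_F(u,v)."""
    nbrs = {v: [] for v in F}
    for u, v in edges:
        length = radial_distance(F[u], F[v])
        nbrs[u].append((v, length))
        nbrs[v].append((u, length))
    dist = {v: math.inf for v in F}
    for v in S:
        dist[v] = 0.0
    heap = [(0.0, v) for v in S]
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, length in nbrs[u]:
            if d + length < dist[v]:
                dist[v] = d + length
                heapq.heappush(heap, (dist[v], v))
    return dist

def localize(F, edges, S, eps):
    """psi(v) = theta(v) F(v) with theta(v) = max(0, 1 - dist(v, S) / eps)."""
    dist = distance_to_set(F, edges, S)
    theta = {v: max(0.0, 1.0 - dist[v] / eps) for v in F}
    return {v: tuple(theta[v] * a for a in F[v]) for v in F}

# Hypothetical input: a path whose embedding rotates from the x-axis to the y-axis.
angles = [0.0, 0.1, 0.3, 0.7, 1.2, math.pi / 2]
F = {v: ((1 + 0.1 * v) * math.cos(t), (1 + 0.1 * v) * math.sin(t))
     for v, t in enumerate(angles)}
edges = [(v, v + 1) for v in range(5)]
S, eps = {0, 1}, 0.5
psi = localize(F, edges, S, eps)
```

On this input `psi` agrees with `F` on `S`, vanishes outside the open `eps`-neighborhood of `S`, and stretches no edge by more than the factor `1 + 2 / eps`.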

The preceding construction reduces the problem of finding disjointly supported set functions to
finding separated regions in $(V, \hat{d}_F)$, each of which contains a large fraction of the $\ell^2$ mass of $F$.

Lemma 3.4. Let $F : V \to \mathbb{R}^h$ be given, and suppose that for some $\beta, \delta > 0$ and $r \in \mathbb{N}$, there exist $r$
disjoint subsets $T_1, T_2, \ldots, T_r \subseteq V$ such that $\hat{d}_F(T_i, T_j) \ge \beta$ for $i \ne j$, and for every $i = 1, 2, \ldots, r$,
we have

$$\sum_{v \in T_i} w(v) \|F(v)\|^2 \ge \delta \sum_{v \in V} w(v) \|F(v)\|^2. \qquad (13)$$

Then there exist disjointly supported functions $\psi_1, \psi_2, \ldots, \psi_r : V \to \mathbb{R}$ such that for $i = 1, 2, \ldots, r$,
we have

$$\mathcal{R}_G(\psi_i) \le \frac{2}{\delta (r - i + 1)} \left(1 + \frac{4}{\beta}\right)^2 \mathcal{R}_G(F). \qquad (14)$$

Proof. For each $i \in [r]$, let $\psi_i : V \to \mathbb{R}^h$ be the result of applying Lemma 3.3 to the domain $T_i$ with
parameter $\varepsilon = \beta/2$. Since $\hat{d}_F(T_i, T_j) \ge \beta$ for $i \ne j$, property (ii) of Lemma 3.3 ensures that the
functions $\{\psi_i\}$ are disjointly supported.

Additionally property (i) implies that for each $i \in [r]$,

$$\sum_{v \in V} w(v) \|\psi_i(v)\|^2 \ge \sum_{v \in T_i} w(v) \|F(v)\|^2 \ge \delta \sum_{v \in V} w(v) \|F(v)\|^2,$$

and by property (iii) of Lemma 3.3, and since the supports are disjoint,

$$\sum_{i=1}^r \sum_{u \sim v} w(u,v) \|\psi_i(u) - \psi_i(v)\|^2 \le 2 \left(1 + \frac{4}{\beta}\right)^2 \sum_{u \sim v} w(u,v) \|F(u) - F(v)\|^2.$$

In particular, if we reorder the maps so that $\mathcal{R}_G(\psi_1) \le \mathcal{R}_G(\psi_2) \le \cdots \le \mathcal{R}_G(\psi_r)$, then the preceding
two inequalities imply (14).

These maps $\{\psi_i\}$ take values in $\mathbb{R}^h$, but it is easy to see that for any $\psi : V \to \mathbb{R}^h$, there
exists a coordinate $j \in \{1, 2, \ldots, h\}$ such that the map $\tilde\psi : V \to \mathbb{R}$ defined by $\tilde\psi(v) := \psi(v)_j$ has
$\mathcal{R}_G(\tilde\psi) \le \mathcal{R}_G(\psi)$. This follows from the general inequality $\frac{a_1 + \cdots + a_k}{b_1 + \cdots + b_k} \ge \min_j \frac{a_j}{b_j}$, valid for all
$a_1, \ldots, a_k, b_1, \ldots, b_k \ge 0$ with some $b_j > 0$. □



3.3 Random partitioning 

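From Lemma 3.4, to find many disjointly supported functions with small Rayleigh quotient, it
suffices to partition $(V, \hat{d}_F)$ into well separated regions, each of which contains a large fraction of
the $\ell^2$ mass of $F$. We will use a suitable distribution over random partitions and argue that at
least one partition in the support of the distribution is good for this purpose.

Lemma 3.5. Let $r, k \in \mathbb{N}$ be given with $k/2 \le r \le k$, and suppose that the map $F : V \to \mathbb{R}^h$ is
$\left(\Delta, \frac{1}{k} + \frac{k - r + 1}{8kr}\right)$-spreading for some $\Delta > 0$. Suppose additionally there is a random partition $\mathcal{P}$ with
the properties that

i) For every $S \in \mathcal{P}$, $\mathrm{diam}(S, d_F) \le \Delta$, and

ii) For every $v \in V$, $\mathbb{P}[B_{\hat{d}_F}(v, \Delta/\alpha) \subseteq \mathcal{P}(v)] \ge 1 - \frac{k - r + 1}{4r}$.

Then there exist $r$ disjoint subsets $T_1, T_2, \ldots, T_r \subseteq V$ such that for each $i \ne j$, we have $\hat{d}_F(T_i, T_j) \ge 2\Delta/\alpha$,
and for every $i = 1, 2, \ldots, r$,

$$\sum_{v \in T_i} w(v) \|F(v)\|^2 \ge \frac{1}{2k} \sum_{v \in V} w(v) \|F(v)\|^2.$$

Proof. For a subset $S \subseteq V$, define

$$\tilde{S} := \{x \in S : B_{\hat{d}_F}(x, \Delta/\alpha) \subseteq S\}.$$

Let $\mathcal{E} = \sum_{v \in V} w(v) \|F(v)\|^2 > 0$. By linearity of expectation, there exists a partition $P$ such that
for every $S \in P$, $\mathrm{diam}(S, d_F) \le \Delta$, and also

$$\sum_{S \in P} \sum_{v \in \tilde{S}} w(v) \|F(v)\|^2 \ge \left(1 - \frac{k - r + 1}{4r}\right) \mathcal{E}. \qquad (15)$$

Furthermore, by the spreading property of $F$, we have, for each $S \in P$,

$$\sum_{x \in \tilde{S}} w(x) \|F(x)\|^2 \le \frac{1}{k} \left(1 + \frac{k - r + 1}{8r}\right) \mathcal{E}.$$

Therefore we may take disjoint unions of the sets $\{\tilde{S} : S \in P\}$ to form at least $r$ disjoint sets
$T_1, T_2, \ldots, T_r$ with the property that for every $i = 1, 2, \ldots, r$, we have

$$\sum_{v \in T_i} w(v) \|F(v)\|^2 \ge \frac{\mathcal{E}}{2k},$$

because the first $r - 1$ pieces will have total mass at most

$$\frac{r - 1}{k} \left(1 + \frac{k - r + 1}{8r}\right) \mathcal{E} \le \left(1 - \frac{k - r + 1}{4r} - \frac{1}{2k}\right) \mathcal{E}$$

for all $r \in [k/2, k]$, leaving at least $\frac{\mathcal{E}}{2k}$ mass left over from (15). □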

We mention a representative corollary that follows from the conjunction of Lemmas 3.4 and 3.5. 

Corollary 3.6. Let k £ N and 6 G (0, 1) be given. Suppose the map F -.V ^R'' is {A,l + 
spreading for some A < 1, and there is a random partition V with the properties that 

i) For every $S \in \mathcal{P}$, $\mathrm{diam}(S, d_F) \le \Delta$, and

ii) For every veV, ^B^^{v, A/a) C V{v)] > 1 - ^ . 

Then there are at least $r \ge \lceil (1 - \delta)k \rceil$ disjointly supported functions $\psi_1, \psi_2, \ldots, \psi_r : V \to \mathbb{R}$ such
that

Proof. In this case, we set $r = \lceil (1 - \delta/2)k \rceil$ in our application of Lemma 3.5. After extracting
at least $\lceil (1 - \delta/2)k \rceil$ sets, we apply Lemma 3.4, but only take the first $r' = \lceil (1 - \delta)k \rceil$ functions
$\psi_1, \psi_2, \ldots, \psi_{r'}$. □

Note, in particular, that we can apply the preceding corollary with $\delta = \frac{1}{2k}$ to obtain $r = k$.
3.4 Higher-order Cheeger inequalities

We now present some theorems applying our machinery to embeddings which come from the
eigenfunctions of $\mathcal{L}_G$.

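Theorem 3.7. For any $\delta \in (0, 1)$, and any weighted graph $G = (V, E, w)$, there exist $r \ge \lceil (1 - \delta)k \rceil$
disjointly supported functions $\psi_1, \psi_2, \ldots, \psi_r : V \to \mathbb{R}$ such that

$$\mathcal{R}_G(\psi_i) \lesssim \frac{k^2}{\delta^4} \lambda_k, \qquad (16)$$

where $\lambda_k$ is the $k$th smallest eigenvalue of $\mathcal{L}_G$. If $G$ excludes $K_h$ as a minor, then the bound
improves to

$$\mathcal{R}_G(\psi_i) \lesssim \frac{h^4}{\delta^4} \lambda_k, \qquad (17)$$

and if $G$ has genus at most $g \ge 1$, then one gets

$$\mathcal{R}_G(\psi_i) \lesssim \frac{\log^2(g+1)}{\delta^4} \lambda_k. \qquad (18)$$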

Proof. Let $f_1, f_2, \ldots, f_k : V \to \mathbb{R}$ be an $\ell^2(V, w)$-orthonormal system of eigenfunctions corresponding
to the first $k$ eigenvalues of $\mathcal{L}_G$, and define $F : V \to \mathbb{R}^k$ by $F(v) = (f_1(v), f_2(v), \ldots, f_k(v))$.

Choose $\Delta \asymp \sqrt{\delta}$ so that $(1 - \Delta^2)^{-1} \le 1 + \frac{\delta}{4}$. In this case, Lemma 3.2 implies that $F$ is $(\Delta, \frac{1}{k} + \frac{\delta}{4k})$-spreading. Now, for general graphs, since $d_F$ is Euclidean, we can use Theorem 2.2 applied to $d_F$ to achieve $\alpha \asymp k/\delta$ in the assumptions of Corollary 3.6. Observe that $\hat d_F \ge d_F$, so that $B_{\hat d_F}(v, \Delta/\alpha) \subseteq B_{d_F}(v, \Delta/\alpha)$, meaning that we can satisfy both conditions (i) and (ii), verifying (16).

For (17) and (18), we use Theorems 2.4 and 2.5, respectively, applied to the shortest-path metric $\hat d_F$. Again, since $\hat d_F \ge d_F$, we have that $\mathrm{diam}(S, \hat d_F) \le \Delta$ implies $\mathrm{diam}(S, d_F) \le \Delta$, so conditions (i) and (ii) are satisfied with $\alpha \asymp h^2/\delta$ and $\alpha \asymp \log(g+1)/\delta$, respectively. □
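The proof above starts from the eigenfunction embedding $F$ and its Rayleigh quotient. As a minimal sketch (not the authors' code), the Python below evaluates $\mathcal{R}_G(F) = \sum_{u\sim v} w(u,v)\|F(u)-F(v)\|^2 / \sum_v w(v)\|F(v)\|^2$ for the unweighted 8-cycle, whose first three eigenfunctions are known in closed form. The helper name `rayleigh_quotient` and the choice of example graph are assumptions made for illustration.

```python
import math

def rayleigh_quotient(edges, F):
    """Rayleigh quotient of a vector-valued map F on a weighted graph.

    edges: dict mapping (u, v) -> weight, each undirected edge listed once.
    F:     dict mapping vertex -> tuple of coordinates.
    """
    degree = {v: 0.0 for v in F}          # w(v) = weighted degree of v
    energy = 0.0
    for (u, v), w in edges.items():
        degree[u] += w
        degree[v] += w
        energy += w * sum((a - b) ** 2 for a, b in zip(F[u], F[v]))
    mass = sum(degree[v] * sum(a * a for a in F[v]) for v in F)
    return energy / mass

# Illustrative example: the 8-cycle. Its normalized Laplacian has eigenvalues
# 1 - cos(2*pi*j/n); the constant, cosine and sine functions span the first three.
n = 8
theta = 2 * math.pi / n
cycle = {(v, (v + 1) % n): 1.0 for v in range(n)}
lam3 = 1 - math.cos(theta)                # lambda_2 = lambda_3 for the cycle
F = {v: (1 / math.sqrt(2 * n),            # each coordinate has unit l2(V, w) norm
         math.cos(theta * v) / math.sqrt(n),
         math.sin(theta * v) / math.sqrt(n)) for v in range(n)}
```

Here `rayleigh_quotient(cycle, F)` equals the average $(\lambda_1 + \lambda_2 + \lambda_3)/3$ of the first three eigenvalues, which is at most $\lambda_3$; this is the sense in which the embedding has Rayleigh quotient at most $\lambda_k$.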

We remark that in Section 4.1, we will give an alternate bound of $O(\delta^{-9} \log^2 k) \cdot \lambda_k$ for (16), which is better for moderate values of $\delta$.

Finally, we can use the preceding theorems in conjunction with Lemma 2.1 to produce many 
non-expanding sets. 

Theorem 3.8. (Non-expanding k-partition) For any weighted graph $G = (V, E, w)$, there exists a partition $V = S_1 \cup S_2 \cup \cdots \cup S_k$ such that
\[ \phi_G(S_i) \lesssim k^4 \sqrt{\lambda_k}, \]
where $\lambda_k$ is the $k$th smallest eigenvalue of $\mathcal{L}_G$. If $G$ excludes $K_h$ as a minor, then the bound improves to
\[ \phi_G(S_i) \lesssim h^2 k^3 \sqrt{\lambda_k}, \]
and if $G$ has genus at most $g \ge 1$, then one gets
\[ \phi_G(S_i) \lesssim \log(g+1)\, k^3 \sqrt{\lambda_k}. \]

Proof. First apply Theorem 3.7 with $\delta = \frac{1}{2k}$ to find disjointly supported functions $\psi_1, \psi_2, \ldots, \psi_k : V \to \mathbb{R}$ satisfying (16). Now apply Lemma 2.1 to find sets $S_1, S_2, \ldots, S_k$ with $S_i \subseteq \mathrm{supp}(\psi_i)$ and $\phi_G(S_i) \le \sqrt{2 \mathcal{R}_G(\psi_i)}$ for each $i = 1, 2, \ldots, k$.

Now reorder the sets so that $w(S_1) \le w(S_2) \le \cdots \le w(S_k)$, and replace $S_k$ with the larger set $S_k' = V \setminus (S_1 \cup S_2 \cup \cdots \cup S_{k-1})$ so that $V = S_1 \cup S_2 \cup \cdots \cup S_{k-1} \cup S_k'$ forms a partition. One can now easily check that
\[ \phi_G(S_k') = \frac{w(E(S_k', \overline{S_k'}))}{w(S_k')} \le \frac{\sum_{i=1}^{k-1} w(E(S_i, \overline{S_i}))}{w(S_k')} \le k \cdot \max_{i} \phi_G(S_i) \lesssim k^4 \sqrt{\lambda_k}. \]
A similar argument yields the other two bounds. □
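The completion step in this proof (reorder by weight, then enlarge the heaviest set to absorb all uncovered vertices) is easy to make concrete. The following is a small sketch under the assumption that the graph is stored as a dict of weighted adjacency dicts without isolated vertices; the names `weight`, `expansion`, and `complete_to_partition` are illustrative and do not come from the paper.

```python
def weight(adj, S):
    """w(S): total weighted degree of the vertices in S."""
    return sum(sum(adj[v].values()) for v in S)

def expansion(adj, S):
    """phi_G(S) = w(E(S, complement of S)) / w(S)."""
    S = set(S)
    cut = sum(w for u in S for v, w in adj[u].items() if v not in S)
    return cut / weight(adj, S)

def complete_to_partition(adj, sets):
    """Sort disjoint sets by weight and replace the heaviest one by
    V minus the union of the others, so that the result partitions V."""
    ordered = sorted((set(S) for S in sets), key=lambda S: weight(adj, S))
    covered = set().union(*ordered[:-1])
    ordered[-1] = set(adj) - covered
    return ordered

# Illustrative example: a path on 6 vertices with unit weights.
path = {v: {} for v in range(6)}
for v in range(5):
    path[v][v + 1] = 1.0
    path[v + 1][v] = 1.0
parts = complete_to_partition(path, [{0, 1}, {4, 5}])
```

In this example the enlarged set is `{2, 3, 4, 5}`, and its expansion stays below $k \cdot \max_i \phi_G(S_i)$ with $k = 2$, matching the inequality chain in the proof.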

Using Theorem 3.7 in conjunction with Lemma 2.1 again yields the following. 

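Theorem 3.9. For every $\delta \in (0, 1)$ and any weighted graph $G = (V, E, w)$, there exist $r \ge \lceil (1-\delta)k \rceil$ disjoint sets $S_1, S_2, \ldots, S_r \subseteq V$ such that,
\[ \phi_G(S_i) \lesssim \frac{k}{\delta^2} \sqrt{\lambda_k}, \tag{19} \]
where $\lambda_k$ is the $k$th smallest eigenvalue of $\mathcal{L}_G$. If $G$ excludes $K_h$ as a minor, then the bound improves to
\[ \phi_G(S_i) \lesssim \frac{h^2}{\delta^2} \sqrt{\lambda_k}, \]
and if $G$ has genus at most $g \ge 1$, then one gets
\[ \phi_G(S_i) \lesssim \frac{\log(g+1)}{\delta^2} \sqrt{\lambda_k}. \]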

We remark that the bound (19) will be improved, in various ways, in Section 4. 



4 Improved quantitative bounds 

A main result of this section is the following theorem. 

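Theorem 4.1. Let $G = (V, E, w)$ be a weighted graph and let $k \in \{1, 2, \ldots, n\}$ and $\delta \in (0, 1)$ be given. Suppose that $f_1, f_2, \ldots, f_k : V \to \mathbb{R}$ forms an $\ell^2(V, w)$-orthonormal system. Then there exist $r \ge \lceil (1-\delta)k \rceil$ disjoint sets $S_1, S_2, \ldots, S_r \subseteq V$ with
\[ \phi_G(S_i) \lesssim \frac{1}{\delta^3} \sqrt{\frac{\sum_{j=1}^{k} \sum_{u \sim v} w(u,v) (f_j(u) - f_j(v))^2}{k} \cdot \log k}\,. \]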



Corollary 4.2. For any weighted graph $G = (V, E, w)$, $k \in \{1, 2, \ldots, n\}$, and $\delta \in (0, 1)$, there exist $r \ge \lceil (1-\delta)k \rceil$ disjoint sets $S_1, S_2, \ldots, S_r \subseteq V$ with
\[ \phi_G(S_i) \lesssim \frac{1}{\delta^3} \sqrt{\lambda_k \log k}, \]
where $\lambda_k$ is the $k$th smallest eigenvalue of $\mathcal{L}_G$.

4.1 Dimension reduction

One should observe that in Theorems 3.7 and 3.9, the loss of $k^2$ in (16) and $k$ in (19) comes from the dimension of the eigenfunction embedding. To achieve somewhat better bounds for general graphs, we now show how to drastically reduce the dimension while preserving the Rayleigh quotient and spreading properties.

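Let $g_1, g_2, \ldots, g_h$ be i.i.d. $k$-dimensional Gaussians, and consider the random mapping $\Gamma_{k,h} : \mathbb{R}^k \to \mathbb{R}^h$ defined by $\Gamma_{k,h}(x) = h^{-1/2}(\langle g_1, x\rangle, \langle g_2, x\rangle, \ldots, \langle g_h, x\rangle)$. Then we have the following basic estimates (see, e.g. [Mat02, Ch. 15] or [LT11, Ch. 1]). For every $x \in \mathbb{R}^k$,
\[ \mathbb{E}\left[\|\Gamma_{k,h}(x)\|^2\right] = \|x\|^2, \tag{20} \]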

and, for every 6 G (0, 



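\[ \mathbb{P}\left[ \|\Gamma_{k,h}(x)\|^2 \notin \left[(1-\delta)\|x\|^2, (1+\delta)\|x\|^2\right] \right] \le 2e^{-\delta^2 h/12}, \tag{21} \]
and for every $\lambda \ge 2$,
\[ \mathbb{P}\left[ \|\Gamma_{k,h}(x)\|^2 \ge \lambda \|x\|^2 \right] \le e^{-\lambda h/12}. \tag{22} \]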
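To make the map $\Gamma_{k,h}$ concrete, here is a minimal pure-Python sketch of a Gaussian random projection with the $h^{-1/2}$ scaling, together with an empirical check of the identity (20). The function names and the Monte Carlo parameters are assumptions for illustration only; a numerical library would normally be used for large inputs.

```python
import math
import random

def make_gaussian_projection(k, h, rng):
    """Sample h i.i.d. standard Gaussian vectors in R^k and return the
    linear map x -> h^{-1/2} (<g_1, x>, ..., <g_h, x>)."""
    G = [[rng.gauss(0.0, 1.0) for _ in range(k)] for _ in range(h)]
    scale = 1.0 / math.sqrt(h)

    def project(x):
        return [scale * sum(g_j * x_j for g_j, x_j in zip(g, x)) for g in G]

    return project

def sq_norm(x):
    return sum(c * c for c in x)

def mean_sq_norm_ratio(x, h, trials, seed=0):
    """Average of |Gamma(x)|^2 / |x|^2 over independent draws of Gamma;
    by (20) this should be close to 1."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        gamma = make_gaussian_projection(len(x), h, rng)
        total += sq_norm(gamma(x))
    return total / (trials * sq_norm(x))
```

A single draw of the map is linear, which is the property used repeatedly in the proof of Lemma 4.3; only the squared norm is preserved in expectation, with the concentration quantified by (21) and (22).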

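Lemma 4.3. Let $G = (V, E, w)$ be a weighted graph. For every $k \in \mathbb{N}$, $\Delta \in [0,1]$, and $\eta \ge 1/k$, the following holds. Suppose that $F : V \to \mathbb{R}^k$ is $(\Delta, \eta)$-spreading. Then for some value
\[ h \lesssim \frac{1 + \log(k) + \log\left(\frac{1}{\Delta}\right)}{\Delta^2}, \]
with probability at least 1/2, the map $\Gamma_{k,h}$ satisfies both of the following conditions:

i) $\mathcal{R}_G(\Gamma_{k,h} \circ F) \le 8 \cdot \mathcal{R}_G(F)$, and

ii) $\Gamma_{k,h} \circ F$ is $(\Delta/4, (1+\Delta)\eta)$-spreading with respect to $G$.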

Proof. Let $\delta = \Delta/16$. We may assume that $k \ge 2$. Choose $h \asymp (1 + \log k + \log(\frac{1}{\Delta}))/\Delta^2$ large enough such that $2e^{-\delta^2 h/12} \le \delta k^{-3}/128$. Let $\Gamma = \Gamma_{k,h}$.

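First, observe that (20) combined with Markov's inequality implies that the following holds with probability at least 3/4,
\[ \sum_{u \sim v} w(u,v) \|\Gamma(F(u)) - \Gamma(F(v))\|^2 \le 4 \sum_{u \sim v} w(u,v) \|F(u) - F(v)\|^2. \tag{23} \]
Now define,
\[ U := \left\{ v \in V : \|\Gamma(F(v))\|^2 \in \left[(1-\delta)\|F(v)\|^2, (1+\delta)\|F(v)\|^2\right] \right\}. \]
By (21), for each $v \in V$,
\[ \mathbb{P}[v \notin U] \le \delta k^{-3}/128. \tag{24} \]
Next, we bound the amount of $\ell^2$ mass that falls outside of $U$. By (24) and linearity of expectation, the expected value of $\sum_{v \notin U} w(v)\|F(v)\|^2$ is at most $\frac{\delta}{128 k^3} \sum_{v \in V} w(v)\|F(v)\|^2$. Therefore, by Markov's inequality, with probability at least 31/32, we have
\[ \sum_{v \notin U} w(v)\|F(v)\|^2 \le \frac{\delta}{4k^3} \sum_{v \in V} w(v)\|F(v)\|^2. \tag{25} \]
In particular, with probability at least 31/32, we have
\[ \sum_{v \in U} w(v)\|\Gamma(F(v))\|^2 \ge (1-\delta) \sum_{v \in U} w(v)\|F(v)\|^2 \ge (1-2\delta) \sum_{v \in V} w(v)\|F(v)\|^2. \tag{26} \]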



Combining our estimates for (23) and (26), we conclude that (i) holds with probability at least 
23/32. Thus we can finish by showing that (ii) holds with probability at least 25/32. We first 
consider property (ii) for subsets of U. 

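Claim 4.4. With probability at least 7/8, the following holds: Equation (26) implies that, for any subset $S \subseteq U$ with $\mathrm{diam}(S, d_{\Gamma(F)}) \le \Delta/4$, we have
\[ \sum_{v \in S} w(v)\|\Gamma(F(v))\|^2 \le (1+6\delta)\eta \sum_{v \in V} w(v)\|\Gamma(F(v))\|^2. \]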

Proof. For every $u, v \in V$, define the event,
\[ A_{u,v} = \left\{ d_{\Gamma(F)}(u,v) \in \left[ d_F(u,v)(1-\delta) - 2\delta,\; d_F(u,v)(1+\delta) + 2\delta \right] \right\} \]
and let $I_{u,v}$ be the random variable indicating that $A_{u,v}$ does not occur.
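We claim that for $u, v \in V$, $A_{u,v}$ occurs if $u, v \in U$, and
\[ \left\| \Gamma\!\left( \frac{F(u)}{\|F(u)\|} - \frac{F(v)}{\|F(v)\|} \right) \right\| \in \left[ (1-\delta) d_F(u,v),\; (1+\delta) d_F(u,v) \right]. \]
To see this, observe that,
\begin{align*}
d_{\Gamma(F)}(u,v) &= \left\| \frac{\Gamma(F(u))}{\|\Gamma(F(u))\|} - \frac{\Gamma(F(v))}{\|\Gamma(F(v))\|} \right\| \\
&\ge \left\| \frac{\Gamma(F(u))}{\|F(u)\|} - \frac{\Gamma(F(v))}{\|F(v)\|} \right\| - \left\| \frac{\Gamma(F(u))}{\|\Gamma(F(u))\|} - \frac{\Gamma(F(u))}{\|F(u)\|} \right\| - \left\| \frac{\Gamma(F(v))}{\|\Gamma(F(v))\|} - \frac{\Gamma(F(v))}{\|F(v)\|} \right\| \\
&\ge (1-\delta) d_F(u,v) - 2\delta,
\end{align*}
where we have used the fact that $\Gamma$ is a linear operator. The other direction can be proved similarly.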
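Therefore, by (21), and a union bound, for any $u, v \in V$, $\mathbb{P}[I_{u,v}] \le 3\delta k^{-3}/128$. Let,
\[ \mathcal{E}_I := \sum_{u,v \in V} w(u) w(v) \|F(u)\|^2 \|F(v)\|^2 I_{u,v}. \]
By linearity of expectation, and Markov's inequality, we conclude that
\[ \mathbb{P}\left[ \mathcal{E}_I \ge \frac{\delta}{4k^3} \left( \sum_{v \in V} w(v)\|F(v)\|^2 \right)^{2} \right] \le \frac{1}{8}. \tag{27} \]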



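Now suppose there exists a subset $S \subseteq U$ with $\mathrm{diam}(S, d_{\Gamma(F)}) \le \Delta/4$ and
\[ \sum_{v \in S} w(v)\|\Gamma(F(v))\|^2 > (1+6\delta)\eta \sum_{v \in V} w(v)\|\Gamma(F(v))\|^2. \]
Fix a vertex $u \in S$. Since for every $v \in S \setminus B_{d_F}(u, \Delta/2)$, we have $d_F(u,v) \ge \Delta/2$, $d_{\Gamma(F)}(u,v) \le \Delta/4$, and recalling that $\delta = \Delta/16$, it must be that $I_{u,v} = 1$. On the other hand, we have
\begin{align*}
\sum_{v \in S \setminus B_{d_F}(u, \Delta/2)} w(v)\|F(v)\|^2 &\ge \sum_{v \in S} w(v)\|F(v)\|^2 - \sum_{v \in B_{d_F}(u, \Delta/2)} w(v)\|F(v)\|^2 \\
&\ge (1-\delta) \sum_{v \in S} w(v)\|\Gamma(F(v))\|^2 - \eta \sum_{v \in V} w(v)\|F(v)\|^2 \\
&\ge (1-\delta)(1+6\delta)\eta \sum_{v \in V} w(v)\|\Gamma(F(v))\|^2 - \eta \sum_{v \in V} w(v)\|F(v)\|^2 \\
&\overset{(26)}{\ge} \left[ (1-2\delta)(1-\delta)(1+6\delta) - 1 \right] \eta \sum_{v \in V} w(v)\|F(v)\|^2 \\
&\ge \delta\eta \sum_{v \in V} w(v)\|F(v)\|^2,
\end{align*}
where we have used the fact that $S \subseteq U$ and also $\mathrm{diam}(B_{d_F}(u, \Delta/2)) \le \Delta$ and the fact that $F$ is $(\Delta, \eta)$-spreading. In the final line, we have used $\delta \le 1/16$.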

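Thus under our assumption on the existence of $S$ and again using $S \subseteq U$, we have
\begin{align*}
\mathcal{E}_I &\ge \sum_{u \in S} w(u)\|F(u)\|^2 \sum_{v \in S \setminus B_{d_F}(u, \Delta/2)} w(v)\|F(v)\|^2 \\
&\ge \delta\eta \left( \sum_{v \in V} w(v)\|F(v)\|^2 \right) \sum_{u \in S} w(u)\|F(u)\|^2 \\
&\ge \delta\eta \left( \sum_{v \in V} w(v)\|F(v)\|^2 \right) (1-\delta) \sum_{u \in S} w(u)\|\Gamma(F(u))\|^2 \\
&\ge \delta(1-\delta)\eta \left( \sum_{v \in V} w(v)\|F(v)\|^2 \right) (1+6\delta)\eta \left( \sum_{v \in V} w(v)\|\Gamma(F(v))\|^2 \right) \\
&\overset{(26)}{\ge} \delta(1-\delta)(1-2\delta)(1+6\delta)\eta^2 \left( \sum_{v \in V} w(v)\|F(v)\|^2 \right)^{2} \\
&> \frac{\delta}{4k^3} \left( \sum_{v \in V} w(v)\|F(v)\|^2 \right)^{2},
\end{align*}
where the last inequality follows from $\eta \ge 1/k$ and $\delta \le 1/16$. Combining this with (27) yields the claim. □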

The preceding claim guarantees a spreading property for subsets S C U. Finally, we need to 
handle points outside U. 

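Claim 4.5. With probability at least 15/16, we have
\[ \sum_{v \notin U} w(v)\|\Gamma(F(v))\|^2 \le \delta k^{-3} \sum_{v \in V} w(v)\|F(v)\|^2. \]

Proof. Let $D_u$ be the event that $u \notin U$, and let $H_u := \|\Gamma(F(u))\|^2 \mathbf{1}_{D_u}$. Then,
\[ \mathbb{E}\left[ \sum_{u \notin U} w(u)\|\Gamma(F(u))\|^2 \right] = \sum_{u \in V} w(u)\, \mathbb{E}[H_u]. \tag{28} \]
Now we can estimate,
\[ \frac{\mathbb{E}[H_u]}{\|F(u)\|^2} \le 2\, \mathbb{P}(D_u) + \mathbb{P}\left( \|\Gamma(F(u))\|^2 > 2\|F(u)\|^2 \right) \cdot \mathbb{E}\left[ \frac{\|\Gamma(F(u))\|^2}{\|F(u)\|^2} \,\middle|\, \|\Gamma(F(u))\|^2 > 2\|F(u)\|^2 \right]. \tag{29} \]
Using the inequality, valid for all non-negative $X$,
\[ \mathbb{P}(X > \lambda_0) \cdot \mathbb{E}[X \mid X > \lambda_0] \le \int_{\lambda_0}^{\infty} \lambda \cdot \mathbb{P}(X > \lambda)\, d\lambda, \]
we can bound the latter term in (29) by,
\[ \int_{2}^{\infty} \lambda \cdot \mathbb{P}\left( \|\Gamma(F(u))\|^2 > \lambda \|F(u)\|^2 \right) d\lambda \le \int_{2}^{\infty} \lambda e^{-\lambda h/12}\, d\lambda = \left( \frac{24}{h} + \frac{144}{h^2} \right) e^{-h/6} \le \frac{\delta}{128 k^3}, \]
where we have used (22) and the initial choice of $h$ sufficiently large. It follows from this, (29), and (24), that
\[ \mathbb{E}[H_u] \le \frac{3\delta}{128 k^3} \|F(u)\|^2. \]
Therefore, by Markov's inequality,
\[ \mathbb{P}\left[ \sum_{v \notin U} w(v)\|\Gamma(F(v))\|^2 > \delta k^{-3} \sum_{v \in V} w(v)\|F(v)\|^2 \right] \le \frac{3}{128}, \]
completing the proof. □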



To conclude the proof of the lemma, we need to verify that (ii) holds with probability at least 25/32. But observe that if (26) holds, then the conclusion of the preceding claim implies,
\[ \sum_{v \notin U} w(v)\|\Gamma(F(v))\|^2 \le \delta k^{-3} \sum_{v \in V} w(v)\|F(v)\|^2 \le \delta\eta \sum_{v \in V} w(v)\|\Gamma(F(v))\|^2. \]
Combining this with Claim 4.4 shows that with probability at least 25/32, $\Gamma \circ F$ is $(\Delta/4, (1+7\delta)\eta)$-spreading, completing the proof. □

As an application of the preceding lemma, observe that we can improve (16) in Theorem 3.7 to the following bound, which is sometimes stronger, using essentially the same proof, but first obtaining a spreading representation $F : V \to \mathbb{R}^{O(\delta^{-2} \log k)}$ using Lemma 4.3.



Theorem 4.6. For any weighted graph $G = (V, E, w)$ and $\delta > 0$ the following holds. For every $k \in \mathbb{N}$, there exist $r \ge \lceil (1-\delta)k \rceil$ disjointly supported functions $\psi_1, \psi_2, \ldots, \psi_r : V \to \mathbb{R}$ such that
\[ \mathcal{R}_G(\psi_i) \lesssim \delta^{-9} \log^2(k+1)\, \lambda_k, \tag{30} \]
where $\lambda_k$ is the $k$th smallest eigenvalue of $\mathcal{L}_G$.

Proof. Let $f_1, f_2, \ldots, f_k : V \to \mathbb{R}$ be an $\ell^2(V, w)$-orthonormal system of eigenfunctions corresponding to the first $k$ eigenvalues of $\mathcal{L}_G$, and define $F : V \to \mathbb{R}^k$ by $F(v) = (f_1(v), f_2(v), \ldots, f_k(v))$.

We may clearly assume that $\delta \ge \frac{1}{2k}$. Choose $\Delta \asymp \delta$ so that $(1 - 16\Delta^2)^{-1}(1 + 4\Delta) \le 1 + \frac{\delta}{4}$. In this case, for some choice of
\[ h \lesssim \frac{1 + \log(k) + \log\left(\frac{1}{\Delta}\right)}{\Delta^2} \lesssim \frac{O(\log k)}{\delta^2}, \]
with probability at least 1/2, $\Gamma_{k,h}$ satisfies the conclusions of Lemma 4.3. Assume that $\Gamma : \mathbb{R}^k \to \mathbb{R}^h$ is some map satisfying these conclusions.

Then combining (ii) from Lemma 4.3 with Lemma 3.2, we see that $\Gamma \circ F : V \to \mathbb{R}^h$ is $(\Delta, \frac{1}{k} + \frac{\delta}{4k})$-spreading. Now we finish as in the proof of Theorem 3.7, using the fact that $h = O(\delta^{-2} \log k)$. □



4.2 A multi-way Cheeger inequality 

Note that Theorem 4.6 combined with Lemma 2.1 is still not strong enough to prove Theorem 4.1. To do that, we need to combine Lemma 4.3 with a strong Cheeger inequality for Lipschitz partitions.

Let $G = (V, E, w)$ be a weighted graph, and $F : V \to \mathbb{R}^h$. Set $M = \max\{\|F(v)\|^2 : v \in V\}$. Let $\tau \in (0, M)$ be chosen uniformly at random, and for any subset $S \subseteq V$, define
\[ \hat S = \{ v \in S : \|F(v)\|^2 \ge \tau \}. \]

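Lemma 4.7. For every $\Delta > 0$, there exists a partition $V = S_1 \cup S_2 \cup \cdots \cup S_m$ such that for every $i \in [m]$, $\mathrm{diam}(S_i, d_F) \le \Delta$, and
\[ \frac{\mathbb{E}\left[ w(E(\hat S_1, \overline{\hat S_1})) + w(E(\hat S_2, \overline{\hat S_2})) + \cdots + w(E(\hat S_m, \overline{\hat S_m})) \right]}{\mathbb{E}\left[ w(\hat S_1) + \cdots + w(\hat S_m) \right]} \lesssim \frac{\sqrt{h}}{\Delta} \sqrt{\mathcal{R}_G(F)}. \tag{31} \]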



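Proof. Since the statement of the lemma is homogeneous in $F$, we may assume that $M = 1$. By Theorem 2.3, there exists a $\Delta$-bounded random partition $\mathcal{P}$ satisfying, for every $u, v \in V$,
\[ \mathbb{P}(\mathcal{P}(u) \ne \mathcal{P}(v)) \lesssim \frac{\sqrt{h}}{\Delta} \cdot d_F(u,v). \tag{32} \]
Let $\mathcal{P} = S_1 \cup S_2 \cup \cdots \cup S_m$, where we recall that $m$ is a random number. First, observe that, $\mathbb{E}[w(\hat S_i)] = \sum_{v \in S_i} w(v)\|F(v)\|^2$, thus,
\[ \mathbb{E}\left[ w(\hat S_1) + \cdots + w(\hat S_m) \right] = \sum_{v \in V} w(v)\|F(v)\|^2. \tag{33} \]
Next, if $\{u,v\} \in E$ with $\|F(u)\|^2 \le \|F(v)\|^2$, then we have
\begin{align*}
\mathbb{P}&\left[ \{u,v\} \in E(\hat S_1, \overline{\hat S_1}) \cup \cdots \cup E(\hat S_m, \overline{\hat S_m}) \right] \\
&\le \mathbb{P}[\mathcal{P}(u) \ne \mathcal{P}(v)] \cdot \mathbb{P}\left[ \|F(u)\|^2 \ge \tau \text{ or } \|F(v)\|^2 \ge \tau \mid \mathcal{P}(u) \ne \mathcal{P}(v) \right] + \mathbb{P}\left[ \tau \in [\|F(u)\|^2, \|F(v)\|^2] \mid \mathcal{P}(u) = \mathcal{P}(v) \right] \\
&\lesssim \frac{\sqrt{h}}{\Delta} \cdot d_F(u,v) \left( \|F(u)\|^2 + \|F(v)\|^2 \right) + \|F(v)\|^2 - \|F(u)\|^2 \\
&\le \left( \|F(u)\| + \|F(v)\| \right) \left( \frac{\sqrt{h}}{\Delta} \cdot d_F(u,v) \left( \|F(u)\| + \|F(v)\| \right) + \|F(v)\| - \|F(u)\| \right) \\
&\le \frac{5\sqrt{h}}{\Delta} \left( \|F(u)\| + \|F(v)\| \right) \|F(u) - F(v)\|,
\end{align*}
where in the final line we have used Lemma 3.1. Thus, we can use Cauchy-Schwarz to write,
\begin{align*}
\mathbb{E}\left[ w(E(\hat S_1, \overline{\hat S_1})) + \cdots + w(E(\hat S_m, \overline{\hat S_m})) \right]
&\lesssim \frac{\sqrt{h}}{\Delta} \sum_{u \sim v} w(u,v) \left( \|F(u)\| + \|F(v)\| \right) \|F(u) - F(v)\| \\
&\le \frac{\sqrt{h}}{\Delta} \sqrt{\sum_{u \sim v} w(u,v) \left( \|F(u)\| + \|F(v)\| \right)^2} \cdot \sqrt{\sum_{u \sim v} w(u,v) \|F(u) - F(v)\|^2} \\
&\le \frac{\sqrt{h}}{\Delta} \sqrt{2 \sum_{v \in V} w(v)\|F(v)\|^2} \cdot \sqrt{\sum_{u \sim v} w(u,v) \|F(u) - F(v)\|^2}.
\end{align*}
Combining this with (33) yields,
\[ \frac{\mathbb{E}_{\mathcal{P}}\, \mathbb{E}\left[ w(E(\hat S_1, \overline{\hat S_1})) + \cdots + w(E(\hat S_m, \overline{\hat S_m})) \right]}{\mathbb{E}_{\mathcal{P}}\, \mathbb{E}\left[ w(\hat S_1) + \cdots + w(\hat S_m) \right]} \lesssim \frac{\sqrt{h}}{\Delta} \sqrt{\frac{\sum_{u \sim v} w(u,v) \|F(u) - F(v)\|^2}{\sum_{v \in V} w(v)\|F(v)\|^2}}, \]
where we use $\mathbb{E}_{\mathcal{P}}$ to denote expectation over the random choice of $\mathcal{P}$. In particular, there must exist a single partition $\mathcal{P}$ satisfying the statement of the lemma. □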

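We can use the preceding lemma to find many non-expanding sets, assuming that $F : V \to \mathbb{R}^h$ has sufficiently good spreading properties.

Lemma 4.8. Let $G = (V, E, w)$ be a weighted graph and let $k \in \mathbb{N}$ and $\delta \in (0, 1)$ be given. If the map $F : V \to \mathbb{R}^h$ is $(\Delta, \frac{1}{k} + \frac{\delta}{4k})$-spreading, then there exist $r \ge \lceil (1-\delta)k \rceil$ disjoint sets $T_1^*, T_2^*, \ldots, T_r^*$, such that
\[ \phi_G(T_i^*) \lesssim \frac{\sqrt{h}}{\delta \Delta} \sqrt{\mathcal{R}_G(F)}. \]

Proof. Let $V = S_1 \cup S_2 \cup \cdots \cup S_m$ be the partition guaranteed by applying Lemma 4.7 to the mapping $F : V \to \mathbb{R}^h$. Set $\mathcal{E} := \sum_{v \in V} w(v)\|F(v)\|^2$. Since $F$ is $(\Delta, \frac{1}{k} + \frac{\delta}{4k})$-spreading and each $S_i$ satisfies $\mathrm{diam}(S_i, d_F) \le \Delta$, we can form $r' \ge \lceil (1-\delta/2)k \rceil$ sets $T_1, T_2, \ldots, T_{r'}$ by taking disjoint unions of the sets $\{S_i\}$ so that for each $i = 1, 2, \ldots, r'$, we have
\[ \frac{\mathcal{E}}{2k} \le \sum_{v \in T_i} w(v)\|F(v)\|^2 \le \frac{\mathcal{E}}{k} \left( 1 + \frac{\delta}{4} \right). \]
In particular, $\mathbb{E}[w(\hat T_i)] = \sum_{v \in T_i} w(v)\|F(v)\|^2 \in \left[ \frac{\mathcal{E}}{2k}, \left(1 + \frac{\delta}{4}\right) \frac{\mathcal{E}}{k} \right]$.

Order the sets so that $\mathbb{E}[w(E(\hat T_i, \overline{\hat T_i}))] \le \mathbb{E}[w(E(\hat T_{i+1}, \overline{\hat T_{i+1}}))]$ for $i = 1, 2, \ldots, r' - 1$, and let $r = \lceil (1-\delta)k \rceil$. Then from (31), it must be that each $i = 1, 2, \ldots, r$ satisfies
\[ \mathbb{E}\left[ w(E(\hat T_i, \overline{\hat T_i})) \right] \lesssim \frac{1}{\delta k} \sum_{j=1}^{m} \mathbb{E}\left[ w(E(\hat S_j, \overline{\hat S_j})) \right] \lesssim \frac{\sqrt{h}}{\delta k \Delta} \sqrt{\mathcal{R}_G(F)} \cdot \mathcal{E}. \]
But $\mathbb{E}[w(\hat T_i)] \asymp \mathcal{E}/k$ for each $i = 1, 2, \ldots, r$, showing that
\[ \frac{\mathbb{E}\left[ w(E(\hat T_i, \overline{\hat T_i})) \right]}{\mathbb{E}\left[ w(\hat T_i) \right]} \lesssim \frac{\sqrt{h}}{\delta \Delta} \sqrt{\mathcal{R}_G(F)}. \]
□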



We can already use this to improve (19) in Theorem 3.9.

Theorem 4.9. For every $\delta \in (0, 1)$ and any weighted graph $G = (V, E, w)$, there exist $r \ge \lceil (1-\delta)k \rceil$ disjoint, non-empty sets $S_1, S_2, \ldots, S_r \subseteq V$ such that,
\[ \phi_G(S_i) \lesssim \frac{\sqrt{k}}{\delta^{3/2}} \sqrt{\lambda_k}, \tag{34} \]
where $\lambda_k$ is the $k$th smallest eigenvalue of $\mathcal{L}_G$.

Proof. Let $\Delta \asymp \sqrt{\delta}$ be such that $(1 - \Delta^2)^{-1} \le 1 + \frac{\delta}{4}$. If we take $F : V \to \mathbb{R}^k$ to be the embedding coming from the first $k$ eigenfunctions of $\mathcal{L}_G$, then Lemma 3.2 implies that $F$ is $(\Delta, \frac{1}{k} + \frac{\delta}{4k})$-spreading. Now apply Lemma 4.8. □

Observe that setting $\delta = \frac{1}{2k}$ in the preceding theorem yields Theorem 1.1. And now we can complete the proof of Theorem 4.1.

Proof of Theorem 4.1. Let $F(v) = (f_1(v), f_2(v), \ldots, f_k(v))$. Choose $\Delta \asymp \delta$ so that $(1 - 16\Delta^2)^{-1}(1 + 4\Delta) \le 1 + \frac{\delta}{4}$. In this case, for some choice of
\[ h \lesssim \frac{1 + \log(k) + \log\left(\frac{1}{\Delta}\right)}{\Delta^2} \lesssim \frac{O(\log k)}{\delta^2}, \]
with probability at least 1/2, $\Gamma_{k,h}$ satisfies the conclusions of Lemma 4.3. Assume that $\Gamma : \mathbb{R}^k \to \mathbb{R}^h$ is some map satisfying these conclusions.

Then combining the conclusions of Lemma 4.3 with Lemma 3.2, we see that $F^* := \Gamma \circ F$ is $(\Delta, \frac{1}{k} + \frac{\delta}{4k})$-spreading, takes values in $\mathbb{R}^h$, and satisfies $\mathcal{R}_G(F^*) \le 8 \cdot \mathcal{R}_G(F)$. Now applying Lemma 4.8 yields the desired result. □



4.3 Noisy hypercubes 

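In the present section, we review examples for which Corollary 4.2 is tight. For $k \in \mathbb{N}$ and $\varepsilon \in (0, 1)$ let $H_{k,\varepsilon} = (V, E)$ be the "noisy hypercube" graph, where $V = \{0,1\}^k$, and for any $x, y \in V$ there is an edge of weight $w(x,y) = \varepsilon^{\|x-y\|_1}$. We put $n = |V| = 2^k$.

Theorem 4.10. For any $1 \le C < k$ and $k \in \mathbb{N}$, and $S \subseteq V$ with $|S| \le Cn/k$, we have
\[ \phi_{H_{k,\varepsilon}}(S) \gtrsim \sqrt{\lambda_k \log(k/C)}, \]
where $\varepsilon = \frac{\log(2)}{\log(k/C)}$.

Proof. Let $H = H_{k,\varepsilon}$. First, the weighted degree of every vertex is
\[ w(x) = \sum_{y \in V} \varepsilon^{\|x-y\|_1} = (1+\varepsilon)^k. \]
Therefore, if we define $f_i : V \to \mathbb{R}$ by $f_i(x) = (-1)^{x_i}$, then
\[ \mathcal{R}_H(f_i) = \frac{2\varepsilon}{1+\varepsilon}. \]
Thus $\lambda_k(H) \le 2\varepsilon$. We will now show that for $|S| \le Cn/k$, one has $\phi_H(S) \ge \frac{1}{2}$, completing the proof of the theorem.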
proof of the theorem. 

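To bound $\phi_H(\cdot)$, we need to recall some Fourier analysis. For $f, g : \{0,1\}^k \to \mathbb{R}$ define the inner product:
\[ \langle f, g \rangle_{L^2(V)} := \frac{1}{n} \sum_{x \in \{0,1\}^k} f(x) g(x). \]
Given $S \subseteq [k]$, the Walsh function $W_S : \{0,1\}^k \to \mathbb{R}$ is defined by $W_S(x) = (-1)^{\sum_{i \in S} x_i}$. The Walsh functions form an orthonormal basis with respect to the above inner product. Therefore, any function $f : \{0,1\}^k \to \mathbb{R}$ has a unique representation as $f = \sum_{S \subseteq [k]} \hat f(S) W_S$, where $\hat f(S) := \langle f, W_S \rangle_{L^2(V)}$.

For $\eta \in [0, 1]$, the Bonami-Beckner operator $T_\eta$ is defined as
\[ T_\eta f := \sum_{S \subseteq [k]} \eta^{|S|} \hat f(S) W_S. \]
The Bonami-Beckner inequality [Bon70, Bec75] states that
\[ \sum_{S \subseteq [k]} \eta^{2|S|} \hat f(S)^2 = \|T_\eta f\|_2^2 \le \|f\|_{1+\eta^2}^2 = \left( \frac{1}{n} \sum_{x \in \{0,1\}^k} |f(x)|^{1+\eta^2} \right)^{\frac{2}{1+\eta^2}}. \tag{35} \]
Let $A$ be the normalized adjacency matrix of $H$, i.e. $A_{x,y} = \frac{\varepsilon^{\|x-y\|_1}}{(1+\varepsilon)^k}$. It follows from an elementary calculation that $W_S$ is an eigenvector of $A$ with eigenvalue $\left(\frac{1-\varepsilon}{1+\varepsilon}\right)^{|S|}$, i.e.
\[ A W_S = \left( \frac{1-\varepsilon}{1+\varepsilon} \right)^{|S|} W_S. \]
For $S \subseteq V$, let $\mathbf{1}_S$ be the indicator function of $S$. Therefore,
\[ \langle \mathbf{1}_S, A \mathbf{1}_S \rangle_{L^2(V)} = \sum_{T \subseteq [k]} \hat{\mathbf{1}}_S(T)^2 \left( \frac{1-\varepsilon}{1+\varepsilon} \right)^{|T|} \le \|\mathbf{1}_S\|^2_{L^{2/(1+\varepsilon)}(V)} = \left( \frac{|S|}{n} \right)^{1+\varepsilon}, \]
where the inequality follows from (35) with $\eta^2 = \frac{1-\varepsilon}{1+\varepsilon}$. Now, observe that for any $S \subseteq V$, we have
\[ w(E(S, \overline{S})) = w(S) - w(E(S, S)) = w(S) - (1+\varepsilon)^k\, n\, \langle \mathbf{1}_S, A \mathbf{1}_S \rangle_{L^2(V)}, \]
where we have written $E(S, S)$ for edges with both endpoints in $S$. Hence, for any subset $S \subseteq V$ of size $|S| \le Cn/k$, we have
\[ \phi_H(S) = \frac{w(E(S, \overline{S}))}{w(S)} \ge 1 - \left( \frac{|S|}{n} \right)^{\varepsilon} \ge 1 - \left( \frac{C}{k} \right)^{\varepsilon} = \frac{1}{2}, \]
where the last step follows by the choice of $\varepsilon = \log(2)/\log(k/C)$. □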
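The two facts about the noisy hypercube used in this proof, the degree formula $w(x) = (1+\varepsilon)^k$ and the Walsh eigenvalue $\left(\frac{1-\varepsilon}{1+\varepsilon}\right)^{|S|}$, can be checked by brute force for small $k$. The sketch below does this in plain Python; the function names and the parameter choice $k = 4$, $\varepsilon = 0.3$ are assumptions for illustration.

```python
from itertools import product

def noisy_hypercube(k, eps):
    """Vertices {0,1}^k and weights w(x, y) = eps^{|x - y|_1} (x = y included)."""
    V = list(product((0, 1), repeat=k))
    w = {(x, y): eps ** sum(a != b for a, b in zip(x, y)) for x in V for y in V}
    return V, w

def walsh(S, x):
    """Walsh function W_S(x) = (-1)^{sum_{i in S} x_i}."""
    return -1.0 if sum(x[i] for i in S) % 2 else 1.0

def apply_normalized_adjacency(V, w, f):
    """(A f)(x) = sum_y w(x, y) f(y) / w(x), with w(x) the weighted degree."""
    return {x: sum(w[x, y] * f[y] for y in V) / sum(w[x, y] for y in V)
            for x in V}

k, eps = 4, 0.3
V, w = noisy_hypercube(k, eps)
S = (0, 2)                                   # a coordinate subset with |S| = 2
W = {x: walsh(S, x) for x in V}
AW = apply_normalized_adjacency(V, w, W)
```

Comparing `AW[x]` with `((1 - eps) / (1 + eps)) ** len(S) * W[x]` at every vertex confirms the eigenvector relation for this instance. The brute-force construction stores $4^k$ weights, so it is only meant for very small $k$.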

Remark 4.1. The preceding theorem shows that even if we only want to find a set $S$ of size $n/\sqrt{k}$, then for values of $k \le O(\log n)$, we can still only achieve a bound of the form $\phi_H(S) \lesssim \sqrt{\lambda_k \log k}$. The state of affairs for $k \gg \log n$ is a fascinating open question.



5 Conclusion 

In Section 1.2, we gave a generic outline of our spectral partitioning algorithm. We remark that our instantiations of this algorithm are simple to describe. As an example, suppose we are given a weighted graph $G = (V, E, w)$ and want to find $k$ disjoint sets, each of expansion $O(\sqrt{\lambda_{2k} \log k})$ (recall Theorem 1.2). We specify a complete randomized algorithm.

One starts with the spectral embedding $F : V \to \mathbb{R}^{2k}$, given by $F(v) = (f_1(v), f_2(v), \ldots, f_{2k}(v))$, where $f_1, f_2, \ldots, f_{2k}$ is the $\ell^2(V, w)$-orthogonal system comprised of the first $2k$ eigenfunctions of the normalized Laplacian. Then, for some $h = O(\log k)$, we perform random projection into $\mathbb{R}^h$. Let $\Gamma_{2k,h} : \mathbb{R}^{2k} \to \mathbb{R}^h$ be the random linear map given by
\[ \Gamma_{2k,h}(x) = (\langle g_1, x \rangle, \ldots, \langle g_h, x \rangle), \]
where $\{g_1, \ldots, g_h\}$ are i.i.d. standard Gaussians. We now have an embedding $F^* := \Gamma_{2k,h} \circ F : V \to \mathbb{R}^h$.

Next, for some $R = O(1)$, we perform the random space partitioning algorithm from [CCGG98]. Let $B$ denote the closed Euclidean unit ball in $\mathbb{R}^h$. Consider $V \subseteq B$ by identifying each vertex with its image under the map $v \mapsto F^*(v)/\|F^*(v)\|$. If $\{x_1, x_2, \ldots\}$ is an i.i.d. sequence of points in $B$ (chosen according to the Lebesgue measure), then we form a partition of $V$ into the sets
\[ V = \bigcup_{i=1}^{\infty} \left[ V \cap B(x_i, R) \setminus \left( B(x_1, R) \cup \cdots \cup B(x_{i-1}, R) \right) \right]. \]
Here, $B(x, R)$ represents the closed Euclidean ball of radius $R$ about $x$, and it is easy to see that this induces a partition of $V$ in a finite number of steps with probability one. Let $V = S_1 \cup \cdots \cup S_m$ be this partition.
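The random ball partitioning described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the implementation from [CCGG98]: the function names, the `max_centers` safety cap, and the representation of points as coordinate lists are assumptions made here.

```python
import math
import random

def random_point_in_ball(h, rng):
    """Uniform sample from the closed unit ball of R^h:
    a uniform direction (normalized Gaussian) times a radius U^{1/h}."""
    g = [rng.gauss(0.0, 1.0) for _ in range(h)]
    norm = math.sqrt(sum(c * c for c in g))
    r = rng.random() ** (1.0 / h)
    return [r * c / norm for c in g]

def ball_partition(points, R, rng, max_centers=100000):
    """points: dict vertex -> coordinates of F*(v)/|F*(v)| in the unit ball.
    Each new center x_i claims the still-unassigned points within distance R."""
    h = len(next(iter(points.values())))
    unassigned = set(points)
    parts = []
    for _ in range(max_centers):             # cap added only as a safeguard
        if not unassigned:
            break
        x = random_point_in_ball(h, rng)
        part = {v for v in unassigned if math.dist(points[v], x) <= R}
        if part:
            parts.append(part)
            unassigned -= part
    if unassigned:
        raise RuntimeError("partition did not finish within max_centers draws")
    return parts
```

Every part lies inside one ball of radius $R$, so its Euclidean diameter is at most $2R$; the number of parts $m$ is random, as noted in the text.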

Finally, for a subset $S \subseteq V$, let $\mathcal{E}(S) = \sum_{v \in S} w(v)\|F^*(v)\|^2$. We sort the partition $\{S_1, S_2, \ldots, S_m\}$ in decreasing order according to $\mathcal{E}(S_i)$. Let $k' = \lceil \frac{3}{2}k \rceil$. Then for each $i = k'+1, k'+2, \ldots, m$, we iteratively set $S_a := S_a \cup S_i$ where
\[ a = \mathrm{argmin}\{ \mathcal{E}(S_j) : j \le k' \}. \]
(Intuitively, we form $k'$ sets from our total $m \ge k'$ sets by balancing the $\mathcal{E}(\cdot)$-value among them.) At the end, we are left with a partition $V = S_1 \cup S_2 \cup \cdots \cup S_{k'}$ of $V$ into $k' \ge 3k/2$ sets.
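The balancing step is a greedy merge: keep the $k'$ heaviest parts and repeatedly add each remaining part to the currently lightest one. A minimal sketch follows, assuming the per-vertex mass $w(v)\|F^*(v)\|^2$ has been precomputed in a dict; the function name `balance_parts` and the use of a heap are choices made here for illustration.

```python
import heapq

def balance_parts(parts, mass, k_prime):
    """Merge m >= k_prime disjoint parts into exactly k_prime parts.

    parts:   list of disjoint vertex sets
    mass:    dict vertex -> w(v) * |F*(v)|^2
    Returns the k_prime merged sets, indexed by their rank in the initial
    ordering by decreasing mass.
    """
    def total(S):
        return sum(mass[v] for v in S)

    ordered = sorted(parts, key=total, reverse=True)
    if len(ordered) < k_prime:
        raise ValueError("need at least k_prime parts")
    # Heap entries are (current mass, index, set); the unique index breaks ties,
    # so the sets themselves are never compared.
    heap = [(total(S), i, set(S)) for i, S in enumerate(ordered[:k_prime])]
    heapq.heapify(heap)
    for S in ordered[k_prime:]:
        m, i, T = heapq.heappop(heap)        # currently lightest of the k_prime sets
        T |= S
        heapq.heappush(heap, (m + total(S), i, T))
    return [T for _, _, T in sorted(heap, key=lambda entry: entry[1])]
```

For example, with five singleton parts of masses 5, 4, 3, 2, 1 and $k' = 3$, the part of mass 2 joins the part of mass 3, and the part of mass 1 then joins the part of mass 4, leaving three sets of mass 5 each.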
To complete the algorithm, for each $i = 1, 2, \ldots, k'$, we choose a value $\tau$ such that
\[ \hat S_i = \{ v \in S_i : \|F^*(v)\|^2 \ge \tau \} \]
has the least expansion. We then output $k$ of the sets $\hat S_1, \hat S_2, \ldots, \hat S_{k'}$ that have the smallest expansion.
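The final thresholding step can be sketched as a sweep over the vertices of each part in decreasing order of $\|F^*(v)\|^2$. The code below is an illustrative sketch, assuming an adjacency-dict graph without isolated vertices and precomputed squared norms; the names `best_threshold_set` and `k_best_sets` are hypothetical.

```python
def expansion(adj, S):
    """phi_G(S) = w(E(S, complement)) / w(S); assumes w(S) > 0."""
    S = set(S)
    cut = sum(w for u in S for v, w in adj[u].items() if v not in S)
    vol = sum(sum(adj[u].values()) for u in S)
    return cut / vol

def best_threshold_set(adj, S, sq_norm):
    """Among the sets {v in S : sq_norm[v] >= tau}, return the one of least
    expansion together with that expansion."""
    order = sorted(S, key=lambda v: sq_norm[v], reverse=True)
    best, best_phi = None, float("inf")
    for i in range(len(order)):
        # A threshold cannot separate vertices of equal norm, so skip such cuts.
        if i + 1 < len(order) and sq_norm[order[i + 1]] == sq_norm[order[i]]:
            continue
        candidate = set(order[: i + 1])
        phi = expansion(adj, candidate)
        if phi < best_phi:
            best, best_phi = candidate, phi
    return best, best_phi

def k_best_sets(adj, parts, sq_norm, k):
    """Threshold every part, then keep the k results of smallest expansion."""
    cuts = [best_threshold_set(adj, S, sq_norm) for S in parts]
    return sorted(cuts, key=lambda pair: pair[1])[:k]
```

This version recomputes the expansion from scratch for every prefix, which keeps the sketch short; an incremental update of the cut weight would be the natural refinement for large graphs.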

The preceding algorithm suggests some natural questions. First, does dimension reduction help to improve the quality of clusterings in practice? For instance, if one runs the $k$-means algorithm (as in [NJW02]) on the randomly projected points, does it yield better results? Another interesting question is whether, at least in certain circumstances, the quality of the $k$-means clustering can be rigorously analyzed when used in place of our random geometric partitioning.



References 

[ABS10] Sanjeev Arora, Boaz Barak, and David Steurer. Subexponential algorithms for unique
games and related problems. In FOCS, pages 563-572, Washington, DC, USA, 2010.
IEEE Computer Society.

[AG83] Bengt Aspvall and John R. Gilbert. Graph coloring using eigenvalue decomposition.
Technical report, Ithaca, NY, USA, 1983.

[AK97] Noga Alon and Nabil Kahale. A spectral technique for coloring random 3-colorable
graphs. SIAM Journal on Computing, 26:1733-1748, 1997.

[Alo86] N. Alon. Eigenvalues and expanders. Combinatorica, 6:83-96, January 1986.

[AM85] N. Alon and V. Milman. Isoperimetric inequalities for graphs, and superconcentrators.
Journal of Combinatorial Theory, Series B, 38(1):73-88, Feb 1985.

[Bec75] William Beckner. Inequalities in Fourier analysis. Ann. of Math. (2), 102(1):159-182,
1975.



[Bon70] Aline Bonami. Étude des coefficients de Fourier des fonctions de $L^p(G)$. Ann. Inst.
Fourier (Grenoble), 20(fasc. 2):335-402 (1971), 1970.

[BP98] S. Brin and L. Page. The anatomy of a large-scale hypertextual web search engine. In
Proceedings of the Seventh International World Wide Web Conference, 1998.

[CCG+98] M. Charikar, C. Chekuri, A. Goel, S. Guha, and S. Plotkin. Approximating a finite
metric by a small number of tree metrics. In Proceedings of the 39th Annual IEEE
Symposium on Foundations of Computer Science, 1998.

[CCGG98] M. Charikar, C. Chekuri, A. Goel, and S. Guha. Rounding via trees: deterministic
approximation algorithms for group Steiner trees and k-median. In 30th Annual ACM
Symposium on Theory of Computing, pages 114-123. ACM, New York, 1998.

[Chu96] F. R. K. Chung. Laplacians of graphs and Cheeger's inequalities. In Combinatorics,
Paul Erdős is eighty, Vol. 2 (Keszthely, 1993), volume 2 of Bolyai Soc. Math. Stud.,
pages 157-172. János Bolyai Math. Soc., Budapest, 1996.

[Chu97] Fan R. K. Chung. Spectral graph theory, volume 92 of CBMS Regional Conference Series
in Mathematics. Published for the Conference Board of the Mathematical Sciences,
Washington, DC, 1997.

[DJM12] Amir Daneshgar, Ramin Javadi, and Laurent Miclo. On nodal domains and higher-order
Cheeger inequalities of finite reversible Markov processes. Stochastic Processes
and their Applications, 2012.

[FT03] J. Fakcharoenphol and K. Talwar. An improved decomposition theorem for graphs
excluding a fixed minor. In Proceedings of 6th Workshop on Approximation, Randomization,
and Combinatorial Optimization, volume 2764 of Lecture Notes in Computer
Science, pages 36-46. Springer, 2003.

[GKL03] Anupam Gupta, Robert Krauthgamer, and James R. Lee. Bounded geometries, fractals,
and low-distortion embeddings. In 44th Symposium on Foundations of Computer
Science, pages 534-543, 2003.

[GT12] Shayan Oveis Gharan and Luca Trevisan. Approximating the expansion profile and
almost optimal local graph clustering. arXiv:1204.2021, 2012.

[Kle99] Jon M. Kleinberg. Authoritative sources in a hyperlinked environment. Journal of the
ACM, 46:668-677, 1999.

[KLPT11] J. Kelner, J. R. Lee, G. Price, and S.-H. Teng. Metric uniformization and spectral
bounds for graphs. Geom. Funct. Anal., 21(5):1117-1143, 2011.

[KPR93] Philip N. Klein, Serge A. Plotkin, and Satish Rao. Excluded minors, network decomposition,
and multicommodity flow. In Proceedings of the 25th Annual ACM Symposium
on Theory of Computing, pages 682-690, 1993.

[LN05] James R. Lee and Assaf Naor. Extending Lipschitz functions via random metric partitions.
Invent. Math., 160(1):59-95, 2005.



[LRTV11] Anand Louis, Prasad Raghavendra, Prasad Tetali, and Santosh Vempala. Algorithmic
extensions of Cheeger's inequality to higher eigenvalues and partitions. In APPROX-RANDOM,
pages 315-326, 2011.

[LRTV12] Anand Louis, Prasad Raghavendra, Prasad Tetali, and Santosh Vempala. Many sparse
cuts via higher eigenvalues. In STOC, 2012.

[LS10] J. R. Lee and A. Sidiropoulos. Genus and the geometry of the cut graph. In Proceedings
of the 21st ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 193-201,
2010.

[LT11] Michel Ledoux and Michel Talagrand. Probability in Banach spaces. Classics in Mathematics.
Springer-Verlag, Berlin, 2011. Isoperimetry and processes, Reprint of the 1991
edition.

[Mat02] J. Matoušek. Lectures on discrete geometry, volume 212 of Graduate Texts in Mathematics.
Springer-Verlag, New York, 2002.

[Mic08] Laurent Miclo. On eigenfunctions of Markov processes on trees. Probability Theory and
Related Fields, 142(3-4):561-594, 2008.

[NJW02] Andrew Ng, Michael Jordan, and Yair Weiss. On spectral clustering: Analysis and an
algorithm. In NIPS'02, 2002.

[OW12] Ryan O'Donnell and David Witmer. Improved small-set expansion from higher eigenvalues.
arXiv:1204.4688, 2012.

[RS10] Prasad Raghavendra and David Steurer. Graph expansion and the unique games conjecture.
In STOC, pages 755-764, New York, NY, USA, 2010. ACM.

[RST10] Prasad Raghavendra, David Steurer, and Prasad Tetali. Approximations for the isoperimetric
and spectral profile of graphs and related parameters. In STOC, pages 631-640,
New York, NY, USA, 2010. ACM.

[SJ89] Alistair J. Sinclair and Mark R. Jerrum. Approximative counting, uniform generation
and rapidly mixing Markov chains. Information and Computation, 82(1):93-133, 1989.

[SM00] Jianbo Shi and Jitendra Malik. Normalized cuts and image segmentation. IEEE Trans.
Pattern Anal. Mach. Intell., 22(8):888-905, 2000.

[Tan12] Mamoru Tanaka. Higher eigenvalues and partitions of a graph. arXiv:1112.3434, 2012.

[TM06] David A. Tolliver and Gary L. Miller. Graph partitioning by spectral rounding: Applications
in image segmentation and clustering. In CVPR '06: Proceedings of the
2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition,
pages 1053-1060. IEEE Computer Society, 2006.

[VM03] Deepak Verma and Marina Meila. Comparison of spectral clustering methods. Technical
Report UW-CSE-03-05-01, Department of Computer Science, University of Washington,
March 2003.



